CD33-Like Protein

ABSTRACT

The present invention concerns a novel CD33-like protein. In particular, isolated nucleic acid molecules are provided encoding the CD33-like protein. Recombinant CD33-like polypeptides are also provided as are recombinant vectors and host cells. The invention further provides methods useful during tumor or inflammatory disease diagnosis or prognosis and therapeutic treatments targeting cells expressing CD33-like polypeptides.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a divisional of U.S. application Ser. No. 10/413,518, filed on Apr. 15, 2003, which is a divisional of U.S. application Ser. No. 08/896,537, filed on Jul. 18, 1997, now U.S. Pat. No. 6,590,088, which claims the benefit of the filing date of provisional application 60/022,481, filed on Jul. 19, 1996, each of which is herein incorporated by reference.

STATEMENT UNDER 37 C.F.R. § 1.77(b)(5)

This application refers to a “Sequence Listing” listed below, which is provided as a text document. The document is entitled “PF285D2_SeqList.txt” (21,890 bytes, created Jan. 17, 2008), and is hereby incorporated by reference in its entirety herein.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a novel CD33-like protein. In particular, isolated nucleic acid molecules are provided encoding the CD33-like protein. Recombinant CD33-like polypeptides are also provided as are recombinant vectors and host cells. The invention further provides methods useful during tumor or inflammatory disease diagnosis or prognosis and therapeutic treatments targeting cells expressing CD33-like polypeptides.

2. Background Information

CD33 was originally defined on human myeloid cells by a panel of monoclonal antibodies (MoAbs) that recognize a glycoprotein of 67 kD that is restricted in its expression to cells of the hematopoietic system (Peiper S. C. et al., in Knapp W, Dorken B, Gilks W R, Rieber E P, Schmidt R E, Stein H, von dem Borne A E G (eds): Leucocyte Typing IV. White Cell Differentiation Antigens. Oxford, UK, Oxford University 1989, p 814), but is first detected on a subpopulation of mixed colony-forming cells (Pierelli L. et al., Br J Haematol 84:24 (1993); Griffin J. D. et al., Leuk Res 8:521 (1984)). Expression then continues along the myelomonocytic pathway until it is downregulated on granulocytes but retained by monocytes and tissue macrophages (Pierelli L. et al., Br J Haematol 84:24 (1993); Bernstein I. D. et al., J Clin Invest 79:1153 (1987)). The expression pattern of CD33 within the hematopoietic system indicates a potential role in the regulation of myeloid cell differentiation. However, despite its initial identification over 10 years ago (Andrews R. G. et al., Blood 62:124 (1983)), the functions and binding properties of CD33 have remained obscure.

CD33 MoAbs are of great importance in the immunodiagnosis of acute leukemias, allowing distinction between myeloid leukemic cells (acute myeloid leukemia (AML) French-American-British classification M1-7) and the usually CD33-negative cells of lymphoid origin (Griffin J. D. et al., Leuk Res 8:521 (1984); Matutes E. et al., Haematol Oncol 3:179 (1985); Bain B. J.: Immunological cytogenetics and other markers, in Bain B J (ed): Leukaemia Diagnosis: A Guide to FAB Classification. London, UK, Gower Medical, 1990, p 61). This is especially valuable for the more immature forms of AML, where morphologic criteria are insufficient yet correct categorization is essential for prognostic predictions and the choice of therapy. CD33 MoAbs have also been used in preliminary therapeutic trials, principally for purging of the bone marrow of AML patients, either before transplantation or in case of diseases that are resistant to chemotherapy (Robertson M. J. et al., Blood 79:2229 (1992); Applebaum F. R. et al., Transplantation 54:829 (1992); Caron P. C. et al., Cancer 73:1049 (1994)). Thus, due to the importance of CD33, there is a clear need to identify and isolate nucleic acid molecules encoding additional polypeptides having CD33-like protein activity.

SUMMARY OF THE INVENTION

The present invention provides isolated nucleic acid molecules comprising a nucleic acid sequence encoding a CD33-like protein whose amino acid sequence is shown in FIGS. 1A-1C (SEQ ID NO:2) or a fragment of the polypeptide. The CD33-like protein gene contains an open reading frame encoding a protein of about 551 amino acid residues whose initiation codon is at positions 37-39 of the nucleotide sequence shown in FIGS. 1A-1C (SEQ ID NO:1), with a leader sequence of about 15 amino acid residues, and a deduced molecular weight of about 60 kDa. The amino acid sequence of the mature CD33-like protein is shown in FIGS. 1A-1C (amino acid residues from about 1 to about 536 in SEQ ID NO:2).

Thus, one aspect of the invention provides an isolated nucleic acidmolecule comprising a polynucleotide having a nucleotide sequenceselected from the group consisting of: (a) a nucleotide sequenceencoding the CD33-like protein having the complete amino acid sequencein SEQ ID NO:2; (b) a nucleotide sequence encoding the CD33-like proteinhaving the complete amino acid sequence in SEQ ID NO:2 but minus theN-terminal methionine residue; (c) a nucleotide sequence encoding themature CD33-like protein having the amino acid sequence at positionsfrom about 1 to about 536 in SEQ ID NO:2; (d) a nucleotide sequenceencoding the CD33-like protein having the complete amino acid sequenceencoded by the cDNA clone contained in ATCC Deposit No. 97521; (e) anucleotide sequence encoding the mature CD33-like protein having theamino acid sequence encoded by the cDNA clone contained in ATCC DepositNo. 97521; (f) a nucleotide sequence encoding the CD33-like proteinextracellular domain; (g) a nucleotide sequence encoding the CD33-likeprotein transmembrane domain; (h) a nucleotide sequence encoding theCD33-like protein intracellular domain; (i) a nucleotide sequenceencoding the CD33-like protein intracellular and extracellular domainswith all or part of the transmembrane domain deleted; and (j) anucleotide sequence complementary to any of the nucleotide sequences in(a), (b), (c), (d), (e), (f), (g), (h), or (i) above.

Further embodiments of the invention include isolated nucleic acidmolecules that comprise a polynucleotide having a nucleotide sequence atleast 95% identical, and more preferably at least 96%, 97%, 98% or 99%identical, to any of the nucleotide sequences in (a), (b), (c), (d),(e), (f), (g), (h), (i), or (j) above, or a polynucleotide whichhybridizes under stringent hybridization conditions to a polynucleotidein (a), (b), (c), (d), (e), (f), (g), (h), (i), or (j) above. Thispolynucleotide which hybridizes does not hybridize under stringenthybridization conditions to a polynucleotide having a nucleotidesequence consisting of only A residues or of only T residues. Anadditional nucleic acid embodiment of the invention relates to anisolated nucleic acid molecule comprising a polynucleotide which encodesthe amino acid sequence of an epitope-bearing portion of a CD33-likeprotein having an amino acid sequence in (a), (b), (c), (d), (e), (f),(g), (h), or (i) above.

The present invention also relates to vectors which include the isolated DNA molecules of the present invention, host cells which are genetically engineered with the recombinant vectors, and the production of CD33-like polypeptides or fragments thereof by recombinant techniques.

The polypeptides of the present invention include the polypeptide encoded by the deposited cDNA, the polypeptide of SEQ ID NO:2 (in particular the mature polypeptide), as well as polypeptides having an amino acid sequence at least 95% identical, more preferably, at least 96% or 99% identical, to the amino acid sequence of the polypeptide encoded by the deposited cDNA or the polypeptide of SEQ ID NO:2.

The invention further provides an isolated CD33-like protein having anamino acid sequence selected from the group consisting of: (a) the aminoacid sequence of the CD33-like protein having the complete 551 aminoacid sequence, including the leader sequence shown in SEQ ID NO:2; (b)the amino acid sequence of the CD33-like protein having the complete 551amino acid sequence, including the leader sequence shown in SEQ ID NO:2but minus the N-terminal methionine residue; (c) the amino acid sequenceof the mature CD33-like protein (without the leader) having the aminoacid sequence at positions from about 1 to about 536 in SEQ ID NO:2; (d)the amino acid sequence of the CD33-like protein having the completeamino acid sequence, including the leader, encoded by the cDNA clonecontained in ATCC Deposit No. 97521; (e) the amino acid sequence of themature CD33-like protein having the amino acid sequence encoded by thecDNA clone contained in ATCC Deposit No. 97521; (f) the amino acidsequence of the CD33-like protein extracellular domain; (g) the aminoacid sequence of the CD33-like protein transmembrane domain; (h) theamino acid sequence of the CD33-like protein intracellular domain; and(i) the amino acid sequence of the CD33-like protein intracellular andextracellular domains with all or part of the transmembrane domaindeleted.

An additional embodiment of this aspect of the invention relates to apeptide or polypeptide which has the amino acid sequence of anepitope-bearing portion of a CD33-like polypeptide having an amino acidsequence described in (a), (b), (c), (d), (e), (f), (g), (h), or (i)above. Peptides or polypeptides having the amino acid sequence of anepitope-bearing portion of a CD33-like polypeptide of the inventioninclude portions of such polypeptides with at least six or seven,preferably at least nine, and more preferably at least about 30 aminoacids to about 50 amino acids, although epitope-bearing polypeptides ofany length up to and including the entire amino acid sequence of apolypeptide of the invention described above also are included in theinvention. In another embodiment, the invention provides an isolatedantibody that binds specifically to a CD33-like polypeptide having anamino acid sequence described in (a), (b), (c), (d), (e), (f), (g), (h),or (i) above. Such antibodies are useful diagnostically ortherapeutically as described below.

The invention further provides a method useful during tumor or inflammatory disease diagnosis, which involves assaying the expression level of the gene encoding the CD33-like protein or the gene copy number in mammalian cells or body fluid and comparing the gene expression level or gene copy number with a standard CD33-like protein gene expression level or gene copy number, whereby an increase in the gene expression level or gene copy number over the standard is indicative of certain tumors or inflammatory disease. By the invention, the above-described method is further useful as a prognostic indicator.

In another embodiment, an in vitro method is provided for purging leukemic hematopoietic cells from the autografts of patients with leukemia. The method involves removing CD33-like antigen-containing hematopoietic cells from bone marrow obtained from the patient with an anti-CD33-like protein monoclonal antibody (MoAb) and complement.

In a further embodiment, the invention provides an in vivo method for selectively killing or inhibiting growth of tumor cells expressing the CD33-like antigen of the present invention, which involves administering to a patient an effective amount of an antagonist to inhibit the CD33-like protein receptor signaling pathway. By the invention, administering such antagonists of the CD33-like protein to a patient is also useful for treating inflammatory disease.

In a still further embodiment, immunotoxins specific for cells expressing the CD33-like protein are provided for selective killing of tumor cells. The immunotoxins of the present invention are further useful according to the method described above for purging leukemic hematopoietic CD33⁺ cells in vitro.

BRIEF DESCRIPTION OF THE FIGURES

FIGS. 1A-1C show the nucleotide (SEQ ID NO:1) and deduced amino acid (SEQ ID NO:2) sequences of the CD33-like protein. Amino acids from about 1 to about 15 represent the signal peptide (first underlined sequence), amino acids from about 16 to about 422 the extracellular domain (sequence between the first and second underlined sequences) (amino acids from about 1 to about 407 in SEQ ID NO:2), amino acids from about 423 to about 464 the transmembrane domain (second underlined sequence) (amino acids from about 408 to about 449 in SEQ ID NO:2), and amino acids from about 465 to about 551 the intracellular domain (the remaining sequence) (amino acids from about 450 to about 536 in SEQ ID NO:2).

FIG. 2 is an amino acid sequence comparison showing the regions of similarity between the amino acid sequences of the CD33-like protein of the present invention (SEQ ID NO:2) and the human differentiation antigen CD33 (SEQ ID NO:3).

FIG. 3 is a Northern blot showing the tissue distribution of human CD33-like protein mRNA expression. Expression was measured in the following tissues: pancreas (lane 1), kidney (lane 2), skeletal muscle (lane 3), liver (lane 4), lung (lane 5), placenta (lane 6), brain (lane 7), heart (lane 8), fetal liver (lane 9), bone marrow (lane 10), peripheral blood leucocytes (lane 11), appendix (lane 12), thymus (lane 13), lymph node (lane 14), and spleen (lane 15).

DETAILED DESCRIPTION

The present invention provides isolated nucleic acid molecules comprising a polynucleotide encoding a CD33-like protein having an amino acid sequence shown in FIGS. 1A-1C (SEQ ID NO:2), which was determined by sequencing a cloned cDNA. The CD33-like protein of the present invention shares sequence homology with the human differentiation antigen (CD33) (FIG. 2 (SEQ ID NO:3)). The nucleotide sequence shown in FIGS. 1A-1C (SEQ ID NO:1) was obtained by sequencing the HMQCD14 clone, which was deposited on Apr. 25, 1996 at the American Type Culture Collection, 10801 University Blvd., Manassas, Va. 20110-2209, USA, and given accession number 97521.

Nucleic Acid Molecules

Unless otherwise indicated, all nucleotide sequences determined by sequencing a DNA molecule herein were determined using an automated DNA sequencer (such as the Model 373 from Applied Biosystems, Inc.), and all amino acid sequences of polypeptides encoded by DNA molecules determined herein were predicted by translation of a DNA sequence determined as above. Therefore, as is known in the art for any DNA sequence determined by this automated approach, any nucleotide sequence determined herein may contain some errors. Nucleotide sequences determined by automation are typically at least about 90% identical, more typically at least about 95% to at least about 99.9% identical to the actual nucleotide sequence of the sequenced DNA molecule. The actual sequence can be more precisely determined by other approaches including manual DNA sequencing methods well known in the art. As is also known in the art, a single insertion or deletion in a determined nucleotide sequence compared to the actual sequence will cause a frame shift in translation of the nucleotide sequence such that the predicted amino acid sequence encoded by a determined nucleotide sequence will be completely different from the amino acid sequence actually encoded by the sequenced DNA molecule, beginning at the point of such an insertion or deletion.

Unless otherwise indicated, each “nucleotide sequence” set forth herein is presented as a sequence of deoxyribonucleotides (abbreviated A, G, C and T). However, by “nucleotide sequence” of a nucleic acid molecule or polynucleotide is intended, for a DNA molecule or polynucleotide, a sequence of deoxyribonucleotides, and for an RNA molecule or polynucleotide, the corresponding sequence of ribonucleotides (A, G, C and U) where each thymidine deoxynucleotide (T) in the specified deoxynucleotide sequence is replaced by the ribonucleotide uridine (U). For instance, reference to an RNA molecule having the sequence of SEQ ID NO:1 set forth using deoxyribonucleotide abbreviations is intended to indicate an RNA molecule having a sequence in which each deoxynucleotide A, G, or C of SEQ ID NO:1 has been replaced by the corresponding ribonucleotide A, G, or C, and each deoxynucleotide T has been replaced by a ribonucleotide U.
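The DNA-to-RNA convention described above can be illustrated with a minimal sketch. The function name `dna_to_rna` and the six-base example are hypothetical and are not taken from SEQ ID NO:1; the sketch assumes the input uses only the abbreviations A, G, C and T.

```python
def dna_to_rna(dna: str) -> str:
    """Return the RNA reading of a sequence written with A, G, C and T.

    A, G and C are kept as the corresponding ribonucleotides and every
    T is replaced by U, as described in the text above.
    """
    dna = dna.upper()
    unexpected = set(dna) - set("AGCT")
    if unexpected:
        # Ambiguity codes such as N are deliberately not handled in this sketch.
        raise ValueError(f"unexpected symbols: {sorted(unexpected)}")
    return dna.replace("T", "U")


print(dna_to_rna("ATGCTT"))  # AUGCUU
```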

Thus, in one aspect, isolated nucleic acid molecules are provided which encode the CD33-like protein. By “isolated” nucleic acid molecule(s) is intended a nucleic acid molecule, DNA or RNA, which has been removed from its native environment. For example, recombinant DNA molecules contained in a vector are considered isolated for the purposes of the present invention. Further examples of isolated DNA molecules include recombinant DNA molecules maintained in heterologous host cells or purified (partially or substantially) DNA molecules in solution. Isolated RNA molecules include in vitro RNA transcripts of the DNA molecules of the present invention. Isolated nucleic acid molecules according to the present invention further include such molecules produced synthetically.

Using the information provided herein, such as the nucleotide sequence set out in FIGS. 1A-1C (SEQ ID NO:1), a nucleic acid molecule of the present invention encoding a CD33-like protein may be obtained using standard cloning and screening procedures, such as those for cloning cDNAs using mRNA as starting material. Illustrative of the invention, the nucleic acid molecule described in FIGS. 1A-1C (SEQ ID NO:1) was discovered in a cDNA library derived from human activated monocytes. Further, the gene was also found in cDNA libraries derived from the following types of human cells: human eosinophils, spleen, chronic lymphocytic leukemia, human activated neutrophils, and human tonsils.

The CD33-like protein gene contains an open reading frame encoding a full-length protein of about 551 amino acid residues whose initiation codon is at positions 37-39 of the nucleotide sequence shown in FIGS. 1A-1C (SEQ ID NO:1), with a predicted leader sequence of about 15 amino acid residues and a deduced molecular weight of about 60 kDa. The amino acid sequence of the predicted mature CD33-like protein is shown in FIGS. 1A-1C from amino acid residue 16 to residue 551 (amino acids from about 1 to about 536 in SEQ ID NO:2). The mature CD33-like protein has three main structural domains. These include the extracellular domain, which includes the ligand binding domain, and is predicted to correspond to amino acid residues from about 16 to about 422 in FIGS. 1A-1C (amino acids from about 1 to about 407 in SEQ ID NO:2). The mature extracellular domain is predicted to be about 407 amino acids in length with a molecular weight of about 45 kDa. Another domain is the transmembrane domain, which has been predicted to correspond to amino acid residues from about 423 to about 464 in FIGS. 1A-1C (amino acids from about 408 to about 449 in SEQ ID NO:2). Another domain is the intracellular domain, which has been predicted to correspond to amino acid residues from about 465 to about 551 in FIGS. 1A-1C (amino acids from about 450 to about 536 in SEQ ID NO:2). The CD33-like protein shown in FIGS. 1A-1C (SEQ ID NO:2) is about 53% identical and about 64% similar to the human differentiation antigen CD33, which can be accessed on GenBank as Accession No. M23197.

As one of ordinary skill would appreciate, due to the possibilities of sequencing errors discussed above, as well as the variability of cleavage sites for leaders in different known proteins, the actual full-length CD33-like protein (including the leader) encoded by the deposited cDNA comprises about 551 amino acids, but may be anywhere in the range of about 545-560 amino acids; and the actual leader sequence of this protein is about 15 amino acids, but may be anywhere in the range of about 12 to about 18 amino acids. It will also be appreciated that reasonable persons of skill in the art may disagree, depending on the criteria used, concerning the exact ‘address’ of the above described CD33-like protein domains. Thus, for example, the exact location of the CD33-like protein extracellular, intracellular and transmembrane domains in FIGS. 1A-1C (SEQ ID NO:2) may vary slightly (e.g., the exact ‘address’ may differ by about 1 to about 5 residues compared to that shown in FIGS. 1A-1C (SEQ ID NO:2)) depending on the criteria used to define the domain.
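Because the domains are given both in the figure numbering (1 to 551) and in the SEQ ID NO:2 numbering (leader residues −15 to −1, mature residues 1 to 536), a small conversion sketch may help in reading the paired coordinates. The names `figure_to_seq_id` and `DOMAINS` are hypothetical, and the sketch assumes a leader of exactly 15 residues and no residue numbered zero; as noted above, the actual boundaries may differ by a few residues.

```python
LEADER_LENGTH = 15  # predicted leader length; stated in the text as "about 15"

# Domain boundaries in the FIGS. 1A-1C numbering, as stated in the text.
DOMAINS = {
    "leader": (1, 15),
    "extracellular": (16, 422),
    "transmembrane": (423, 464),
    "intracellular": (465, 551),
}


def figure_to_seq_id(fig_pos: int, leader: int = LEADER_LENGTH) -> int:
    """Convert a figure position (1-based) to the SEQ ID NO:2 numbering.

    Leader residues map to negative numbers and the first mature residue
    maps to 1; there is no residue numbered 0 (an assumed convention).
    """
    if fig_pos < 1:
        raise ValueError("figure positions start at 1")
    if fig_pos > leader:
        return fig_pos - leader
    return fig_pos - leader - 1


for name, (start, end) in DOMAINS.items():
    length = end - start + 1
    print(name, figure_to_seq_id(start), figure_to_seq_id(end), length)
```

Under these assumptions the extracellular domain spans 407 residues, which agrees with the length stated above.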

The present invention also provides the mature form(s) of the CD33-like protein of the present invention. According to the signal hypothesis, proteins secreted by mammalian cells have a signal or secretory leader sequence which is cleaved from the mature protein once export of the growing protein chain across the rough endoplasmic reticulum has been initiated. Most mammalian cells and even insect cells cleave secreted proteins with the same specificity. However, in some cases, cleavage of a secreted protein is not entirely uniform, which results in two or more mature species of the protein. Further, it has long been known that the cleavage specificity of a secreted protein is ultimately determined by the primary structure of the complete protein, that is, it is inherent in the amino acid sequence of the polypeptide. Therefore, the present invention provides a nucleotide sequence encoding the mature CD33-like proteins having the amino acid sequence encoded by the cDNA clone contained in the host identified as ATCC Deposit No. 97521 and as shown in SEQ ID NO:2. By the mature CD33-like protein having the amino acid sequence encoded by the cDNA clone contained in the host identified as ATCC Deposit No. 97521 is meant the mature form(s) of the CD33-like protein produced by expression in a mammalian cell (e.g., COS cells, as described below) of the complete open reading frame encoded by the human DNA sequence of the clone contained in the vector in the deposited host. As indicated below, the mature CD33-like protein having the amino acid sequence encoded by the cDNA clone contained in ATCC Deposit No. 97521 may or may not differ from the predicted “mature” CD33-like protein shown in SEQ ID NO:2 (amino acids from about 1 to about 536) depending on the accuracy of the predicted cleavage site based on computer analysis.

Methods for predicting whether a protein has a secretory leader as well as the cleavage point for that leader sequence are available. For instance, the methods of McGeoch, Virus Res. 3:271-286 (1985) and von Heijne, Nucleic Acids Res. 14:4683-4690 (1986) can be used. The accuracy of predicting the cleavage points of known mammalian secretory proteins for each of these methods is in the range of 75-80%. von Heijne, supra. However, the two methods do not always produce the same predicted cleavage point(s) for a given protein.

In the present case, the predicted amino acid sequence of the complete CD33-like polypeptide of the present invention was analyzed by a computer program (PSORT) (Nakai, K. and Kanehisa, M. Genomics 14:897-911 (1992)), which is an expert system for predicting the cellular location of a protein based on the amino acid sequence. As part of this computational prediction of localization, the methods of McGeoch and von Heijne are incorporated. The analysis by the PSORT program predicted the cleavage site between amino acids −1 and 1 in SEQ ID NO:2. Thereafter, the complete amino acid sequences were further analyzed by visual inspection, applying a simple form of the (−1, −3) rule of von Heijne. von Heijne, supra. Thus, the leader sequence for the CD33-like protein is predicted to consist of amino acid residues from about −15 to about −1 in SEQ ID NO:2, while the mature CD33-like protein is predicted to consist of residues from about 1 to about 536.

As indicated, nucleic acid molecules of the present invention may be in the form of RNA, such as mRNA, or in the form of DNA, including, for instance, cDNA and genomic DNA obtained by cloning or produced synthetically. The DNA may be double-stranded or single-stranded. Single-stranded DNA may be the coding strand, also known as the sense strand, or it may be the non-coding strand, also referred to as the anti-sense strand.

Isolated nucleic acid molecules of the present invention include DNA molecules comprising an open reading frame (ORF) whose initiation codon is at positions 37-39 of the nucleotide sequence shown in FIGS. 1A-1C (SEQ ID NO:1) and further include DNA molecules which comprise a sequence substantially different from all or part of the ORF whose initiation codon is at positions 37-39 of the nucleotide sequence shown in FIGS. 1A-1C (SEQ ID NO:1) but which, due to the degeneracy of the genetic code, still encode the CD33-like protein or a fragment thereof. Of course, the genetic code is well known in the art. Thus, it would be routine for one skilled in the art to generate the degenerate variants described above.
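The notion of degenerate variants can be illustrated with a short sketch that enumerates every DNA sequence encoding the same peptide under the standard genetic code. The names `CODON_TABLE`, `translate` and `degenerate_variants`, and the three-residue peptide "MLW", are hypothetical illustrations and are not drawn from SEQ ID NO:1 or SEQ ID NO:2.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Reverse lookup: amino acid -> list of synonymous codons.
SYNONYMS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def degenerate_variants(peptide: str):
    """Yield every DNA sequence that encodes the given peptide."""
    for codons in product(*(SYNONYMS[aa] for aa in peptide)):
        yield "".join(codons)


variants = list(degenerate_variants("MLW"))
print(len(variants))  # 6: one codon for M, six for L, one for W
print(all(translate(v) == "MLW" for v in variants))  # True
```

The number of variants grows multiplicatively with peptide length, so exhaustive enumeration is practical only for short fragments; for a full-length coding sequence one would sample or constrain codon choices instead.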

In addition, the invention provides a nucleic acid molecule having a nucleotide sequence related to an extensive portion of SEQ ID NO:1. This cDNA clone is designated HTOBA14R (SEQ ID NO:11).

The sequence of a public EST, having GenBank Accession No. H71235, related to a portion of SEQ ID NO:1 is shown in SEQ ID NO:12. This public EST is 433 nucleotides in length and contains a region of 111 nucleotides having a sequence identical to nucleotides 1899 to 2009 of the sequence shown in SEQ ID NO:1 with the exception of two undisclosed nucleotides at positions 12 and 22 in SEQ ID NO:12. These undisclosed nucleotides are represented by the letter “N”.

In another aspect, the invention provides isolated nucleic acid molecules encoding the CD33-like polypeptide having an amino acid sequence encoded by the cDNA of the clone deposited as ATCC Deposit No. 97521 on Apr. 25, 1996. Preferably, the nucleic acid molecule will encode the mature polypeptide encoded by the above-described deposited cDNA.

The invention further provides an isolated nucleic acid molecule having the nucleotide sequence shown in FIGS. 1A-1C (SEQ ID NO:1) or the nucleotide sequence of the CD33-like protein gene contained in the above-described deposited cDNA, or a nucleic acid molecule having a sequence complementary to one of the above sequences. In a further embodiment, isolated nucleic acid molecules are provided encoding the full-length CD33-like protein lacking the N-terminal methionine. Such isolated molecules, particularly DNA molecules, are useful as probes for gene mapping by in situ hybridization with chromosomes and for detecting expression of the CD33-like protein gene in human tissue by Northern blot analysis. As described in detail below, detecting enhanced CD33-like protein gene expression in certain tissues is indicative of neoplasia.

The present invention is further directed to fragments of the isolated nucleic acid molecules described herein. By a fragment of an isolated nucleic acid molecule having the nucleotide sequence of the deposited cDNA or the nucleotide sequence shown in FIGS. 1A-1C (SEQ ID NO:1) is intended fragments at least about 15 nt, and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably, at least about 40 nt in length which are useful as diagnostic probes and primers as discussed herein. Of course, larger DNA fragments 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650, 1700, 1750, 1800, 1850, 1900, 1950, 2000, or 2010 nt in length are also useful according to the present invention as are fragments corresponding to most, if not all, of the nucleotide sequence of the deposited cDNA or as shown in FIGS. 1A-1C (SEQ ID NO:1). By a fragment at least 20 nt in length, for example, is intended fragments which include 20 or more contiguous bases from the nucleotide sequence of the deposited cDNA or the nucleotide sequence as shown in FIGS. 1A-1C (SEQ ID NO:1). Since the gene has been deposited and the nucleotide sequence shown in FIGS. 1A-1C (SEQ ID NO:1) is provided, generating such DNA fragments would be routine to the skilled artisan. For example, restriction endonuclease cleavage or shearing by sonication could easily be used to generate fragments of various sizes. Alternatively, such fragments could be generated synthetically.
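The definition of a fragment as a run of contiguous bases can be sketched computationally. The function `contiguous_fragments` and the ten-base example sequence are hypothetical; the sketch only enumerates in silico windows and says nothing about how fragments would be produced at the bench.

```python
def contiguous_fragments(sequence: str, length: int) -> list[str]:
    """Return every window of `length` contiguous bases in `sequence`."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return [
        sequence[start:start + length]
        for start in range(len(sequence) - length + 1)
    ]


example = "ACGTACGTAC"  # hypothetical 10-nt sequence, not from SEQ ID NO:1
fragments = contiguous_fragments(example, 4)
print(len(fragments))  # 7 windows, i.e. 10 - 4 + 1
print(fragments[0])    # ACGT
```

A sequence of n bases contains n − k + 1 windows of exactly k contiguous bases, so the count for any fragment length follows directly from the length of the reference sequence.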

Preferred nucleic acid fragments of the present invention include nucleic acid molecules encoding: a polypeptide comprising the CD33-like protein extracellular domain (amino acid residues from about 16 to about 422 in FIGS. 1A-1C (amino acids from about 1 to about 407 in SEQ ID NO:2)); a polypeptide comprising the CD33-like protein transmembrane domain (amino acid residues from about 423 to about 464 in FIGS. 1A-1C (amino acids from about 408 to about 449 in SEQ ID NO:2)); a polypeptide comprising the CD33-like protein intracellular domain (amino acid residues from about 465 to about 551 in FIGS. 1A-1C (amino acids from about 450 to about 536 in SEQ ID NO:2)); and a polypeptide comprising the CD33-like protein extracellular and intracellular domains having all or part of the transmembrane region deleted.

In another aspect, the invention provides an isolated nucleic acid molecule comprising a polynucleotide which hybridizes under stringent hybridization conditions to a portion of the polynucleotide in a nucleic acid molecule of the invention described above, for instance, the cDNA clone contained in ATCC Deposit No. 97521. By “stringent hybridization conditions” is intended overnight incubation at 42° C. in a solution comprising: 50% formamide, 5×SSC (750 mM NaCl, 75 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5×Denhardt's solution, 10% dextran sulfate, and 20 μg/ml denatured, sheared salmon sperm DNA, followed by washing the filters in 0.1×SSC at about 65° C. By a polynucleotide which hybridizes to a “portion” of a polynucleotide is intended a polynucleotide (either DNA or RNA) hybridizing to at least about 15 nucleotides (nt), and more preferably at least about 20 nt, still more preferably at least about 30 nt, and even more preferably about 30-70 nt of the reference polynucleotide. These are useful as diagnostic probes and primers as discussed above and in more detail below.
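The SSC concentrations in the conditions above scale linearly with the stated strength, which a brief sketch can make explicit. The 1× values are derived from the 5×SSC composition given in the text (750 mM NaCl, 75 mM trisodium citrate); the names `SSC_1X_MM` and `ssc_composition` are hypothetical.

```python
# 1x SSC in millimolar, obtained by dividing the 5x values in the text by five.
SSC_1X_MM = {"NaCl": 750.0 / 5, "trisodium citrate": 75.0 / 5}


def ssc_composition(strength: float) -> dict[str, float]:
    """Return component concentrations (mM) for SSC at the given strength."""
    if strength <= 0:
        raise ValueError("strength must be positive")
    # Round to suppress floating-point noise in the reported values.
    return {name: round(mm * strength, 3) for name, mm in SSC_1X_MM.items()}


print(ssc_composition(5))    # hybridization buffer strength
print(ssc_composition(0.1))  # wash strength
```

On this arithmetic the 0.1×SSC wash contains 15 mM NaCl and 1.5 mM trisodium citrate, a fifty-fold lower salt concentration than the hybridization solution, which is what makes the wash step the more stringent of the two.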

Of course, polynucleotides hybridizing to a larger portion of the reference polynucleotide (e.g., the deposited cDNA clone), for instance, a portion 50-750 nt in length, or even to the entire length of the reference polynucleotide, are also useful as probes according to the present invention, as are polynucleotides corresponding to most, if not all, of the nucleotide sequence of the deposited cDNA or the nucleotide sequence as shown in FIGS. 1A-1C (SEQ ID NO:1). By a portion of a polynucleotide of “at least 20 nt in length,” for example, is intended 20 or more contiguous nucleotides from the nucleotide sequence of the reference polynucleotide (e.g., the deposited cDNA or the nucleotide sequence as shown in FIGS. 1A-1C (SEQ ID NO:1)). As indicated, such portions are useful diagnostically either as a probe according to conventional DNA hybridization techniques or as primers for amplification of a target sequence by the polymerase chain reaction (PCR), as described, for instance, in Molecular Cloning, A Laboratory Manual, 2nd. edition, edited by Sambrook, J., Fritsch, E. F. and Maniatis, T., (1989), Cold Spring Harbor Laboratory Press, the entire disclosure of which is hereby incorporated herein by reference.

Of course, a polynucleotide which hybridizes only to a poly A sequence (such as the 3′ terminal poly(A) tract of the CD33-like cDNA shown in FIGS. 1A-1C (SEQ ID NO:1)), or to a complementary stretch of T (or U) residues, would not be included in a polynucleotide of the invention used to hybridize to a portion of a nucleic acid of the invention, since such a polynucleotide would hybridize to any nucleic acid molecule containing a poly(A) stretch or the complement thereof (e.g., practically any double-stranded cDNA clone).
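The exclusion described above amounts to rejecting candidate probes that consist solely of A residues or solely of T (or U) residues. A minimal screening sketch follows; the function name `is_poly_a_or_t` and the example probes are hypothetical.

```python
def is_poly_a_or_t(probe: str) -> bool:
    """Return True if the probe is only A residues, or only T or only U residues."""
    residues = set(probe.upper())
    return len(residues) == 1 and residues <= {"A", "T", "U"}


candidates = ["AAAAAAAAAAAAAAAAAAAA", "TTTTTTTTTTTTTTT", "ATGCCGTTAGCAAAAAAA"]
usable = [probe for probe in candidates if not is_poly_a_or_t(probe)]
print(usable)  # ['ATGCCGTTAGCAAAAAAA']
```

A probe that merely ends in a run of A residues is retained, since it also contains sequence capable of specific hybridization; only purely homopolymeric A or T (or U) probes are screened out.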

As indicated, nucleic acid molecules of the present invention whichencode the CD33-like polypeptide may include, but are not limited to thecoding sequence for the mature polypeptide, by itself; the codingsequence for the mature polypeptide and additional sequences, such asthose encoding the about 15 amino acid leader or secretory sequence,such as a pre-, or pro- or prepro-protein sequence; the coding sequenceof the mature polypeptide, with or without the aforementioned additionalcoding sequences, together with additional, non-coding sequences,including for example, but not limited to introns and non-coding 5′ and3′ sequences, such as the transcribed, non-translated sequences thatplay a role in transcription, mRNA processing—including splicing andpolyadenylation signals, for example—ribosome binding and stability ofmRNA; additional coding sequence which codes for additional amino acids,such as those which provide additional functionalities. Thus, forinstance, the sequence encoding the polypeptide may be fused to a markersequence, such as a sequence encoding a peptide, which facilitatespurification of the fused polypeptide. In certain preferred embodimentsof this aspect of the invention, the marker sequence is a hexa-histidinepeptide, such as the tag provided in a pQE vector (Qiagen, Inc.), amongothers, many of which are commercially available. As described in Gentzet al., Proc. Natl. Acad. Sci. USA 86:821-824 (1989), for instance,hexa-histidine provides for convenient purification of the fusionprotein. The HA tag is another peptide useful for purification whichcorresponds to an epitope derived of influenza hemagglutinin protein,which has been described by Wilson et al., Cell 37:767 (1984), forinstance. As discussed below, other such fusion proteins include theCD33-like protein fused to Fc at the N- or C-terminus.

The present invention further relates to variants of the nucleic acid molecules of the present invention, which encode fragments, analogs or derivatives of the CD33-like protein. Variants may occur naturally, such as an allelic variant. By an “allelic variant” is intended one of several alternate forms of a gene occupying a given locus on a chromosome of an organism. Lewin, B., ed., Genes II, John Wiley & Sons, New York (1985). Non-naturally occurring variants may be produced using art-known mutagenesis techniques.

Such variants include those produced by nucleotide substitutions, deletions or additions. The substitutions, deletions or additions may involve one or more nucleotides. The variants may be altered in coding or non-coding regions or both. Alterations in the coding regions may produce conservative or non-conservative amino acid substitutions, deletions or additions.

Especially preferred among these are silent substitutions, additions and deletions, which do not alter the properties and activities of the CD33-like protein or portions thereof. Also especially preferred in this regard are conservative substitutions.

Further embodiments of the invention include isolated nucleic acid molecules comprising a polynucleotide having a nucleotide sequence at least 95% identical, and more preferably at least 96%, 97%, 98% or 99% identical to: (a) a nucleotide sequence encoding the full-length CD33-like protein having the complete amino acid sequence (including the predicted leader sequence) shown in FIGS. 1A-1C (SEQ ID NO:2) or as encoded by the cDNA clone contained in ATCC Deposit No. 97521; (b) a nucleotide sequence encoding the protein having the amino acid sequence in FIGS. 1A-1C (SEQ ID NO:2), but lacking the N-terminal methionine; (c) a nucleotide sequence encoding the mature CD33-like protein (the full-length polypeptide with the leader removed) having an amino acid sequence shown in FIGS. 1A-1C (SEQ ID NO:2) or as encoded by the cDNA clone contained in ATCC Deposit No. 97521; (d) a nucleotide sequence encoding the CD33-like protein extracellular domain having an amino acid sequence shown in FIGS. 1A-1C (amino acids from about 1 to about 407 in SEQ ID NO:2) or as encoded by the cDNA clone contained in ATCC Deposit No. 97521; (e) a nucleotide sequence encoding the CD33-like protein intracellular domain having an amino acid sequence shown in FIGS. 1A-1C (amino acids from about 450 to about 536 in SEQ ID NO:2) or as encoded by the cDNA clone contained in ATCC Deposit No. 97521; (f) a nucleotide sequence encoding the CD33-like protein transmembrane domain having an amino acid sequence shown in FIGS. 1A-1C (amino acids from about 408 to about 449 in SEQ ID NO:2) or as encoded by the cDNA clone contained in ATCC Deposit No. 97521; (g) a nucleotide sequence encoding the CD33-like protein extracellular domain and intracellular domain (with all or part of the transmembrane domain deleted) having an amino acid sequence shown in FIGS. 1A-1C (SEQ ID NO:2) or as encoded by the cDNA clone contained in ATCC Deposit No. 97521; or (h) a nucleotide sequence complementary to any of the nucleotide sequences of (a)-(g).

By a polynucleotide having a nucleotide sequence at least, for example, 95% “identical” to a reference nucleotide sequence encoding a CD33-like protein is intended that the nucleotide sequence of the polynucleotide is identical to the reference sequence except that the polynucleotide sequence may include up to five point mutations per each 100 nucleotides of the reference nucleotide sequence encoding the CD33-like polypeptide. In other words, to obtain a polynucleotide having a nucleotide sequence at least 95% identical to a reference nucleotide sequence, up to 5% of the nucleotides in the reference sequence may be deleted or substituted with another nucleotide, or a number of nucleotides up to 5% of the total nucleotides in the reference sequence may be inserted into the reference sequence. These mutations of the reference sequence may occur at the 5′ or 3′ terminal positions of the reference nucleotide sequence or anywhere between those terminal positions, interspersed either individually among nucleotides in the reference sequence or in one or more contiguous groups within the reference sequence.

As a practical matter, whether any particular nucleic acid molecule has a nucleotide sequence at least 95%, 97%, 98% or 99% identical to, for instance, the nucleotide sequence shown in FIGS. 1A-1C (SEQ ID NO:1) or to the nucleotide sequence of the deposited cDNA clone can be determined conventionally using known computer programs such as the Bestfit® program (Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics Computer Group, University Research Park, 575 Science Drive, Madison, Wis. 53711). Bestfit® uses the local homology algorithm of Smith and Waterman (Advances in Applied Mathematics 2:482-489, 1981) to find the best segment of homology between two sequences. When using Bestfit® or any other sequence alignment program to determine whether a particular sequence is, for instance, 95% identical to a reference sequence according to the present invention, the parameters are set, of course, such that the percentage of identity is calculated over the full length of the reference nucleotide sequence and that gaps in homology of up to 5% of the total number of nucleotides in the reference sequence are allowed.
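By way of illustration only, the identity calculation described above can be sketched in Python. The sketch does not reproduce Bestfit® or the Smith-Waterman algorithm; it assumes that a pairwise alignment has already been produced by some alignment program and that gaps are written as '-'. The function names and the way differences are counted (each substitution, deleted position and inserted position counts once against the full length of the reference) are assumptions made for this example.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent identity calculated over the full length of the reference.

    Both strings must come from the same pairwise alignment, with '-'
    marking gaps. Every substitution, deletion (gap in the query) and
    insertion (gap in the reference) counts as one difference.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    ref_length = len(aligned_ref.replace("-", ""))
    if ref_length == 0:
        raise ValueError("reference sequence is empty")
    differences = sum(1 for r, q in zip(aligned_ref, aligned_query) if r != q)
    # Clamp at zero in case insertions outnumber reference positions.
    return max(0.0, 100.0 * (ref_length - differences) / ref_length)


def is_at_least_identical(aligned_ref: str, aligned_query: str,
                          threshold: float = 95.0) -> bool:
    """True if the query meets the stated identity threshold."""
    return percent_identity(aligned_ref, aligned_query) >= threshold
```

Under these assumptions, a 100-nucleotide reference tolerates at most five differences before it falls below the 95% threshold.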

The present application is directed to such nucleic acid molecules which are at least 95%, 97%, 98% or 99% identical to a nucleic acid sequence described above irrespective of whether they encode a polypeptide having CD33-like protein activity. This is because, even where a particular nucleic acid molecule does not encode a polypeptide having CD33-like protein activity, one of skill would still know how to use the nucleic acid molecule, for instance, as a hybridization probe or a polymerase chain reaction (PCR) primer. Uses of the nucleic acid molecules of the present invention that do not encode a polypeptide having CD33-like protein activity include, inter alia, (1) isolating the gene encoding the CD33-like protein, or allelic variants thereof, from a cDNA library; (2) fluorescence in situ hybridization (FISH) to metaphase chromosomal spreads to provide precise chromosomal location of the CD33-like gene as described in Verma et al., Human Chromosomes: a Manual of Basic Techniques, Pergamon Press, New York (1988); and (3) Northern blot analysis for detecting expression of CD33-like mRNA in specific tissues.

Preferred, however, are nucleic acid molecules which are at least 95%, 97%, 98% or 99% identical to a nucleic acid sequence described above which do, in fact, encode a polypeptide having CD33-like protein activity. By “a polypeptide having CD33-like protein activity” is intended polypeptides exhibiting activity similar, but not necessarily identical, to an activity of the CD33-like polypeptide of the invention as measured in a particular biological assay.

For example, in a solid-phase binding assay, the CD33-like protein of the present invention can mediate sialic acid-dependent adhesion to red blood cells (RBC) in a manner similar to CD33 and sialoadhesin. (This assay is described in detail in Freeman, S. D., et al., Blood 85(8):2005-2012 (1995), the contents of which are incorporated herein by reference.) In particular, human RBC can be modified enzymatically to carry sialic acids in unique linkages (Paulson J. C., et al., Methods Enzymol. 138:162 (1987)). This provides a useful experimental approach to characterize the specificity of sialic acid-dependent adhesion molecules. Briefly, the assay involves generating Fc-CD33-like protein by polymerase chain reaction amplification of the extracellular portion of the CD33-like protein cDNA described above and cloning into the Fc expression vector, pIG (Simmons D L: Cloning cell surface molecules by transient expression in mammalian cells, in Hartley D A (ed): Cellular Interactions in Development-A Practical Approach. Oxford, UK, IRL, 1993, p 93), followed by expression in COS-1 cells as described in Freeman, S. D., et al., Blood 85(8):2005-2012 (1995). The COS cell supernatants are harvested at about 6 days posttransfection, and the Fc-CD33-like protein is purified on protein A Sepharose® as described in Simmons D L, supra.

The solid-phase binding assay involves coating enzyme-linked immunosorbent assay plates with the Fc-CD33-like protein as described in Kelm, S. et al., Curr Biol 4:965 (1994) and adding radiolabeled RBC bearing sialic acid residues in one or more of the structures described in Freeman, S. D. et al., Blood 85(8):2005-2012 (1995) at an appropriate concentration (such as, for example, 4×10⁶ cells/mL). After 30 minutes, nonadherent and loosely adherent cells are removed by washing. The percentage of cells binding in each well is determined as follows: (cpm bound/cpm input) × 100. The CD33-like polypeptide of the present invention will bind RBC in a sialic acid-dependent manner.

Thus, by the invention, a “polypeptide having CD33-like protein activity” includes polypeptides that also bind RBC in the above-described assay in a sialic acid-dependent manner. In other words, in a side-by-side comparison, the percentage of RBC binding as described above using the CD33-like protein will be similar (i.e., not more than a 50% difference, and preferably, not more than a 25% difference) to that occurring using a candidate “polypeptide having CD33-like protein activity.”
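The binding calculation and the side-by-side comparison can be expressed as a short Python sketch. The function names are hypothetical, and the sketch assumes that the "difference" between the two binding percentages is measured relative to the value obtained with the CD33-like protein; the text does not state whether a relative or an absolute difference is intended.

```python
def percent_bound(cpm_bound: float, cpm_input: float) -> float:
    """Percentage of cells binding in a well: (cpm bound / cpm input) x 100."""
    if cpm_input <= 0:
        raise ValueError("cpm_input must be positive")
    return 100.0 * cpm_bound / cpm_input


def has_similar_binding(reference_pct: float, candidate_pct: float,
                        max_relative_difference: float = 0.50) -> bool:
    """Compare a candidate polypeptide with the CD33-like protein.

    Assumption: the difference is taken relative to the reference value.
    Use 0.50 for the general criterion and 0.25 for the preferred one.
    """
    if reference_pct <= 0:
        raise ValueError("reference_pct must be positive")
    relative = abs(candidate_pct - reference_pct) / reference_pct
    return relative <= max_relative_difference
```

For instance, with illustrative counts only, a well with 2,500 cpm bound out of 10,000 cpm input gives 25% binding; against a reference of 40% this is a 37.5% relative difference, which satisfies the 50% criterion but not the preferred 25% criterion.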

In another embodiment, the above-described binding assay is useful for screening potential antagonists and agonists of CD33-like protein activity. The method involves determining whether a candidate agonist or antagonist (such as an anti-CD33-like protein antibody) enhances or inhibits sialic acid-dependent binding of the CD33-like protein to RBC relative to a standard binding level, i.e., the degree of CD33-like protein binding to RBC in the absence of the candidate agonist or antagonist.

Of course, due to the degeneracy of the genetic code, one of ordinary skill in the art will immediately recognize that a large number of the nucleic acid molecules having a nucleotide sequence at least 95%, 97%, 98% or 99% identical to a nucleic acid sequence described above will encode a polypeptide “having CD33-like protein activity.” In fact, since degenerate variants all encode the same polypeptide, this will be clear to the skilled artisan even without performing the above-described comparison assay. It will be further recognized in the art that, for such nucleic acid molecules that are not degenerate variants, a reasonable number will also encode a polypeptide having CD33-like protein activity. This is because the skilled artisan is fully aware of amino acid substitutions that are either less likely or not likely to significantly affect protein function (e.g., replacing one aliphatic amino acid with a second aliphatic amino acid).
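The point about degenerate variants can be illustrated with a minimal Python sketch that translates two coding sequences with the standard genetic code and checks whether they encode the same polypeptide. The short sequences used in the usage note are hypothetical and are not taken from SEQ ID NO:1; the function names are likewise chosen for this example.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence; '*' marks a termination codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(reference: str, variant: str) -> bool:
    """True if the variant differs in nucleotides but encodes the same polypeptide."""
    return (reference.upper() != variant.upper()
            and translate(reference) == translate(variant))
```

For example, the hypothetical sequences ATGGCTCTG and ATGGCCTTA differ at three nucleotide positions yet both translate to Met-Ala-Leu, so the second is a degenerate variant of the first.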

For example, guidance concerning how to make phenotypically silent amino acid substitutions is provided in Bowie, J. U., et al., “Deciphering the Message in Protein Sequences: Tolerance to Amino Acid Substitutions,” Science 247:1306-1310 (1990), wherein the authors indicate that there are two main approaches for studying the tolerance of an amino acid sequence to change. The first method relies on the process of evolution, in which mutations are either accepted or rejected by natural selection. The second approach uses genetic engineering to introduce amino acid changes at specific positions of a cloned gene and selects or screens to identify sequences that maintain functionality. As the authors state, these studies have revealed that proteins are surprisingly tolerant of amino acid substitutions. The authors further indicate which amino acid changes are likely to be permissive at a certain position of the protein. For example, most buried amino acid residues require nonpolar side chains, whereas few features of surface side chains are generally conserved. Other such phenotypically silent substitutions are described in Bowie, J. U. et al., supra, and the references cited therein.

Vectors and Host Cells

The present invention also relates to vectors which include the isolated DNA molecules of the present invention, host cells which are genetically engineered with the recombinant vectors, and the production of CD33-like polypeptides or fragments thereof by recombinant techniques.

Recombinant constructs may be introduced into host cells using well known techniques such as infection, transduction, transfection, transvection and transformation. The vector may be, for example, a plasmid, viral or retroviral vector. Retroviral vectors may be replication competent or replication defective. In the latter case, retroviral propagation generally will occur only in complementing host cells.

The polynucleotides may be joined to a vector containing a selectable marker for propagation in a host. Generally, a plasmid vector is introduced in a precipitate, such as a calcium phosphate precipitate, or in a complex with a charged lipid. If the vector is a virus, it may be packaged in vitro using an appropriate packaging cell line and then transfected into host cells.

Preferred are vectors comprising cis-acting control regions to the polynucleotide of interest. Appropriate trans-acting factors either are supplied by the host, supplied by a complementing vector or supplied by the vector itself upon introduction into the host.

In certain preferred embodiments in this regard, the vectors provide for specific expression, which may be inducible and/or cell type-specific. Particularly preferred among inducible vectors are vectors that can be induced for expression by environmental factors that are easy to manipulate, such as temperature and nutrient additives.

Expression vectors useful in the present invention include chromosomal-, episomal- and virus-derived vectors, e.g., vectors derived from bacterial plasmids, bacteriophage, yeast episomes, yeast chromosomal elements, viruses such as baculoviruses, papova viruses, vaccinia viruses, adenoviruses, fowl pox viruses, pseudorabies viruses and retroviruses, and vectors derived from combinations thereof, such as those derived from plasmid and bacteriophage genetic elements, such as cosmids and phagemids.

The DNA insert should be operatively linked to an appropriate promoter, such as the phage lambda PL promoter, the E. coli lac, trp and tac promoters, the SV40 early and late promoters and promoters of retroviral LTRs, to name a few. Other suitable promoters will be known to the skilled artisan. The expression constructs will contain sites for transcription initiation, termination, and, in the transcribed region, a ribosome binding site for translation. The coding portion of the mature transcripts expressed by the constructs will include a translation initiating AUG at the beginning and a termination codon appropriately positioned at the end of the polypeptide to be translated.

As indicated, the expression vectors will preferably contain selectable markers. Such markers include dihydrofolate reductase or neomycin resistance for eukaryotic cell culture and tetracycline or ampicillin resistance for culturing in E. coli and other bacteria. Representative examples of appropriate hosts include bacterial cells, such as E. coli, Streptomyces and Salmonella typhimurium; fungal cells, such as yeast; insect cells such as Drosophila S2 and Spodoptera Sf9; animal cells such as CHO, COS and Bowes melanoma; and plant cells. Appropriate culture media and conditions for the above-described host cells are known in the art.

Among vectors preferred for use in bacteria are pQE70, pQE60 and pQE-9, available from Qiagen; pBS vectors, Phagescript vectors, Bluescript® vectors, pNH8A, pNH16a, pNH18A, pNH46A, available from Stratagene; and ptrc99a, pKK223-3, pKK233-3, pDR540, pRIT5 available from Pharmacia. Among preferred eukaryotic vectors are pWLNEO, pSV2CAT, pOG44, pXT1 and pSG available from Stratagene; and pSVK3, pBPV, pMSG and pSVL available from Pharmacia. Other suitable vectors will be readily apparent to the skilled artisan.

Known bacterial promoters for use in the present invention include the E. coli lacI and lacZ promoters, the T3 and T7 promoters, the gpt promoter, the lambda PR and PL promoters and the trp promoter. Suitable eukaryotic promoters include the CMV immediate early promoter, the HSV thymidine kinase promoter, the early and late SV40 promoters, the promoters of retroviral LTRs, such as those of the Rous sarcoma virus (“RSV”), and metallothionein promoters, such as the mouse metallothionein-I promoter.

Introduction of the construct into the host cell can be effected by calcium phosphate transfection, DEAE-dextran mediated transfection, cationic lipid-mediated transfection, electroporation, transduction, infection or other methods. Such methods are described in many standard laboratory manuals, such as Davis et al., Basic Methods in Molecular Biology (1986).

Transcription of the DNA encoding the polypeptides of the present invention by higher eukaryotes may be increased by inserting an enhancer sequence into the vector. Enhancers are cis-acting elements of DNA, usually from about 10 to 300 bp, that act to increase transcriptional activity of a promoter in a given host cell-type. Examples of enhancers include the SV40 enhancer, which is located on the late side of the replication origin at bp 100 to 270, the cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.

For secretion of the translated protein into the lumen of the endoplasmic reticulum, into the periplasmic space or into the extracellular environment, appropriate secretion signals may be incorporated into the expressed polypeptide. The signals may be endogenous to the polypeptide or they may be heterologous signals.

The polypeptide may be expressed in a modified form, such as a fusion protein, and may include not only secretion signals but also additional heterologous functional regions. Thus, for instance, a region of additional amino acids, particularly charged amino acids, may be added to the N-terminus of the polypeptide to improve stability and persistence in the host cell, during purification or during subsequent handling and storage. Also, peptide moieties may be added to the polypeptide to facilitate purification. Such regions may be removed prior to final preparation of the polypeptide. The addition of peptide moieties to polypeptides to engender secretion or excretion, to improve stability and to facilitate purification, among others, are familiar and routine techniques in the art. A preferred fusion protein comprises a heterologous region from immunoglobulin that is useful to solubilize proteins. For example, EPA 0 464 533 (Canadian counterpart 2045869) discloses fusion proteins comprising various portions of the constant region of immunoglobulin molecules together with another human protein or part thereof. In many cases, the Fc part in a fusion protein is thoroughly advantageous for use in therapy and diagnosis and thus results, for example, in improved pharmacokinetic properties (EPA 0 232 262). On the other hand, for some uses it would be desirable to be able to delete the Fc part after the fusion protein has been expressed, detected and purified in the advantageous manner described. This is the case when the Fc portion proves to be a hindrance to use in therapy and diagnosis, for example when the fusion protein is to be used as an antigen for immunizations. In drug discovery, for example, human proteins, such as the hIL-5 receptor, have been fused with Fc portions for the purpose of high-throughput screening assays to identify antagonists of hIL-5. See D. Bennett et al., J. of Molec. Recognition 8:52-58 (1995) and K. Johanson et al., J. Biol. Chem. 270:9459-9471 (1995).

The CD33-like protein can be recovered and purified from recombinant cell cultures by well-known methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography and lectin chromatography. Most preferably, high performance liquid chromatography (“HPLC”) is employed for purification.

Polypeptides of the present invention include naturally purified products, products of chemical synthetic procedures, and products produced by recombinant techniques from a prokaryotic or eukaryotic host, including, for example, bacterial, yeast, higher plant, insect and mammalian cells. Depending upon the host employed in a recombinant production procedure, the polypeptides of the present invention may be glycosylated or may be non-glycosylated. In addition, polypeptides of the invention may also include an initial modified methionine residue, in some cases as a result of host-mediated processes.

CD33-Like Polypeptides and Fragments

The invention further provides an isolated CD33-like polypeptide having the amino acid sequence encoded by the deposited cDNA, or the amino acid sequence as shown in FIGS. 1A-1C (SEQ ID NO:2), or a peptide or polypeptide comprising a portion of the above polypeptides. The terms “peptide” and “oligopeptide” are considered synonymous (as is commonly recognized) and each term can be used interchangeably as the context requires to indicate a chain of at least two amino acids coupled by peptidyl linkages. The word “polypeptide” is used herein for chains containing more than ten amino acid residues. All oligopeptide and polypeptide formulas or sequences herein are written from left to right and in the direction from amino terminus to carboxy terminus.

It will be recognized in the art that some amino acid sequences of the CD33-like polypeptide can be varied without significant effect on the structure or function of the protein. If such differences in sequence are contemplated, it should be remembered that there will be critical areas on the protein which determine activity.

Thus, the invention further includes variations of the CD33-like polypeptide which show substantial CD33-like polypeptide activity or which include regions of CD33-like protein such as the protein portions discussed below. Such mutants include deletions, insertions, inversions, repeats, and type substitutions. As indicated above, guidance concerning which amino acid changes are likely to be phenotypically silent can be found in Bowie, J. U., et al., “Deciphering the Message in Protein Sequences: Tolerance to Amino Acid Substitutions,” Science 247:1306-1310 (1990).

Thus, the fragment, derivative or analog of the polypeptide of SEQ ID NO:2, or that encoded by the deposited cDNA, may be (i) one in which one or more of the amino acid residues are substituted with a conserved or non-conserved amino acid residue (preferably a conserved amino acid residue) and such substituted amino acid residue may or may not be one encoded by the genetic code, or (ii) one in which one or more of the amino acid residues includes a substituent group, or (iii) one in which the mature polypeptide is fused with another compound, such as a compound to increase the half-life of the polypeptide (for example, polyethylene glycol), or (iv) one in which the additional amino acids are fused to the mature polypeptide, such as an IgG Fc fusion region peptide or leader or secretory sequence or a sequence which is employed for purification of the mature polypeptide or a proprotein sequence. Such fragments, derivatives and analogs are deemed to be within the scope of those skilled in the art from the teachings herein.

Of particular interest are substitutions of charged amino acids with another charged amino acid and with neutral or negatively charged amino acids. The latter results in proteins with reduced positive charge to improve the characteristics of the CD33-like protein. The prevention of aggregation is highly desirable. Aggregation of proteins not only results in a loss of activity but can also be problematic when preparing pharmaceutical formulations, because aggregates can be immunogenic. (Pinckard et al., Clin. Exp. Immunol. 2:331-340 (1967); Robbins et al., Diabetes 36:838-845 (1987); Cleland et al., Crit. Rev. Therapeutic Drug Carrier Systems 10:307-377 (1993)).

As indicated, changes are preferably of a minor nature, such as conservative amino acid substitutions that do not significantly affect the folding or activity of the protein (see Table 1).

TABLE 1. Conservative Amino Acid Substitutions

Aromatic:     Phenylalanine, Tryptophan, Tyrosine
Hydrophobic:  Leucine, Isoleucine, Valine
Polar:        Glutamine, Asparagine
Basic:        Arginine, Lysine, Histidine
Acidic:       Aspartic Acid, Glutamic Acid
Small:        Alanine, Serine, Threonine, Methionine, Glycine
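As a minimal sketch, the groupings of Table 1 can be encoded as a lookup in Python to test whether a given substitution is conservative in the sense of the table. One-letter amino acid codes are used for brevity, and the function name is an assumption of this example. Residues that do not appear in Table 1 (such as cysteine and proline) are treated here as having no conservative replacement, which is a choice made for the sketch rather than a statement of the specification.

```python
# Groups from Table 1, written in one-letter amino acid codes.
CONSERVATIVE_GROUPS = {
    "Aromatic": {"F", "W", "Y"},
    "Hydrophobic": {"L", "I", "V"},
    "Polar": {"Q", "N"},
    "Basic": {"R", "K", "H"},
    "Acidic": {"D", "E"},
    "Small": {"A", "S", "T", "M", "G"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues differ and fall in the same Table 1 group."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # not a substitution at all
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS.values())
```

Under this encoding, leucine to isoleucine is conservative, whereas leucine to aspartic acid is not.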

Amino acids in the CD33-like protein of the present invention that are essential for function can be identified by methods known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham and Wells, Science 244:1081-1085 (1989)). The latter procedure introduces single alanine mutations at every residue in the molecule. Sites that are critical for ligand binding can also be determined by structural analysis such as crystallization, nuclear magnetic resonance or photoaffinity labeling (Smith et al., J. Mol. Biol. 224:899-904 (1992) and de Vos et al., Science 255:306-312 (1992)).
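The set of constructs for an alanine scan can be enumerated with a few lines of Python. This is only a sketch of the bookkeeping: it lists each single alanine mutant of an input sequence, skipping positions that are already alanine. The function name, the mutation labels and the use of simple 1-based numbering on the input string (rather than the numbering of SEQ ID NO:2) are assumptions of the example.

```python
def alanine_scan(sequence: str) -> list[tuple[str, str]]:
    """Return (label, mutant sequence) for every single alanine substitution.

    Labels use the form <original residue><1-based position>A, e.g. 'K3A'.
    Positions that are already alanine are skipped.
    """
    mutants = []
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue
        label = f"{residue}{index + 1}A"
        mutant = sequence[:index] + "A" + sequence[index + 1:]
        mutants.append((label, mutant))
    return mutants
```

Each mutant would then be expressed and tested for biological activity, for example in the binding assay described earlier, to identify residues whose replacement abolishes function.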

The polypeptides of the present invention are preferably provided in an isolated form. By “isolated polypeptide” is intended a polypeptide removed from its native environment. Thus, a polypeptide produced and/or contained within a recombinant host cell is considered isolated for purposes of the present invention. Also intended as an “isolated polypeptide” are polypeptides that have been purified, partially or substantially, from a recombinant host cell. For example, a recombinantly produced version of the CD33-like protein polypeptide can be substantially purified by the one-step method described in Smith and Johnson, Gene 67:31-40 (1988).

The polypeptides of the present invention include the polypeptide encoded by the deposited cDNA including the leader; the mature polypeptide encoded by the deposited cDNA minus the leader (i.e., the mature protein); a polypeptide comprising amino acids about −15 to about 536 in SEQ ID NO:2; a polypeptide comprising amino acids about −14 to about 536 in SEQ ID NO:2; a polypeptide comprising amino acids about 1 to about 536 in SEQ ID NO:2; a polypeptide comprising amino acids about 1 to about 407 in SEQ ID NO:2; a polypeptide comprising amino acids about 408 to about 449 in SEQ ID NO:2; a polypeptide comprising amino acids about 450 to about 536 in SEQ ID NO:2; a polypeptide comprising the CD33-like polypeptide extracellular and intracellular domains with all or part of the transmembrane domain deleted; as well as polypeptides which are at least 95% identical, more preferably at least 96%, 97%, 98% or 99% identical, to the polypeptide encoded by the deposited cDNA or to the polypeptide of SEQ ID NO:2, and also include portions of such polypeptides with at least 30 amino acids and more preferably at least 50 amino acids.
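The residue ranges recited above can be turned into string slices with a small Python helper. The sketch assumes that the leader occupies positions −15 to −1, that the mature protein starts at position 1 and that there is no position 0; it also treats the "about" boundaries as exact, which the specification does not require. The function names are hypothetical, and any sequence passed in is supplied by the user, since SEQ ID NO:2 is not reproduced here.

```python
LEADER_LENGTH = 15  # positions -15 to -1 (assumed; there is no position 0)

# Domain boundaries as recited, in mature-protein numbering (inclusive).
DOMAINS = {
    "extracellular": (1, 407),
    "transmembrane": (408, 449),
    "intracellular": (450, 536),
}


def to_index(position: int) -> int:
    """Convert a position (-15..-1, 1..536) to a 0-based string index."""
    if position == 0 or position < -LEADER_LENGTH:
        raise ValueError(f"invalid position: {position}")
    if position < 0:
        return position + LEADER_LENGTH
    return position + LEADER_LENGTH - 1


def extract(full_length: str, start: int, end: int) -> str:
    """Return residues start..end (inclusive) of the full-length sequence."""
    return full_length[to_index(start):to_index(end) + 1]


def delete_transmembrane(full_length: str) -> str:
    """Join the extracellular and intracellular domains, omitting the TM domain."""
    return (extract(full_length, *DOMAINS["extracellular"])
            + extract(full_length, *DOMAINS["intracellular"]))
```

With these boundaries the three domains span 407, 42 and 87 residues respectively, which together account for all 536 residues of the mature protein.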

By a polypeptide having an amino acid sequence at least, for example, 95% “identical” to a reference amino acid sequence of a CD33-like polypeptide is intended that the amino acid sequence of the polypeptide is identical to the reference sequence except that the polypeptide sequence may include up to five amino acid alterations per each 100 amino acids of the reference amino acid sequence of the CD33-like polypeptide. In other words, to obtain a polypeptide having an amino acid sequence at least 95% identical to a reference amino acid sequence, up to 5% of the amino acid residues in the reference sequence may be deleted or substituted with another amino acid, or a number of amino acids up to 5% of the total amino acid residues in the reference sequence may be inserted into the reference sequence. These alterations of the reference sequence may occur at the amino or carboxy terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the reference sequence or in one or more contiguous groups within the reference sequence.

As a practical matter, whether any particular polypeptide is at least 95%, 97%, 98% or 99% identical to, for instance, the amino acid sequence shown in FIGS. 1A-1C (SEQ ID NO:2) or to the amino acid sequence encoded by the deposited cDNA clone or a portion thereof can be determined conventionally using known computer programs such as the Bestfit® program (Wisconsin Sequence Analysis Package, Version 8 for Unix, Genetics Computer Group, University Research Park, 575 Science Drive, Madison, Wis. 53711). When using Bestfit® or any other sequence alignment program to determine whether a particular sequence is, for instance, 95% identical to a reference sequence according to the present invention, the parameters are set, of course, such that the percentage of identity is calculated over the full length of the reference amino acid sequence and that gaps in homology of up to 5% of the total number of amino acid residues in the reference sequence are allowed.

The polypeptides of the present invention are useful as molecular weight markers on SDS-PAGE gels or on molecular sieve gel filtration columns using methods well known to those of skill in the art.

In another aspect, the invention provides a peptide or polypeptide comprising an epitope-bearing portion of a polypeptide of the invention. The epitope of this polypeptide portion is an immunogenic or antigenic epitope of a polypeptide of the invention. An “immunogenic epitope” is defined as a part of a protein that elicits an antibody response when the whole protein is the immunogen. These immunogenic epitopes are believed to be confined to a few loci on the molecule. On the other hand, a region of a protein molecule to which an antibody can bind is defined as an “antigenic epitope.” The number of immunogenic epitopes of a protein generally is less than the number of antigenic epitopes. See, for instance, Geysen, H. M. et al., Proc. Natl. Acad. Sci. USA 81:3998-4002 (1984).

As to the selection of peptides or polypeptides bearing an antigenic epitope (i.e., that contain a region of a protein molecule to which an antibody can bind), it is well known in the art that relatively short synthetic peptides that mimic part of a protein sequence are routinely capable of eliciting an antiserum that reacts with the partially mimicked protein. See, for instance, Sutcliffe, J. G. et al., Science 219:660-666 (1984). Peptides capable of eliciting protein-reactive sera are frequently represented in the primary sequence of a protein, can be characterized by a set of simple chemical rules, and are confined neither to immunodominant regions of intact proteins (i.e., immunogenic epitopes) nor to the amino or carboxyl terminals. Peptides that are extremely hydrophobic and those of six or fewer residues generally are ineffective at inducing antibodies that bind to the mimicked protein; longer, soluble peptides, especially those containing proline residues, usually are effective. Sutcliffe et al., supra, at 661. For instance, 18 of 20 peptides designed according to these guidelines, containing 8-39 residues covering 75% of the sequence of the influenza virus hemagglutinin HA1 polypeptide chain, induced antibodies that reacted with the HA1 protein or intact virus; and 12 of 12 peptides from the MuLV polymerase and 18 of 18 from the rabies glycoprotein induced antibodies that precipitated the respective proteins.

Antigenic epitope-bearing peptides and polypeptides of the invention are therefore useful to raise antibodies, including monoclonal antibodies, that bind specifically to a polypeptide of the invention. Thus, a high proportion of hybridomas obtained by fusion of spleen cells from donors immunized with an antigen epitope-bearing peptide generally secrete antibody reactive with the native protein. Sutcliffe et al., supra, at 663. The antibodies raised by antigenic epitope-bearing peptides or polypeptides are useful to detect the mimicked protein, and antibodies to different peptides may be used for tracking the fate of various regions of a protein precursor which undergoes posttranslational processing. The peptides and anti-peptide antibodies may be used in a variety of qualitative or quantitative assays for the mimicked protein, for instance in competition assays, since it has been shown that even short peptides (e.g., about 9 amino acids) can bind and displace the larger peptides in immunoprecipitation assays. See, for instance, Wilson, I. A. et al., Cell 37:767-778 at 777 (1984). The anti-peptide antibodies of the invention also are useful for purification of the mimicked protein, for instance, by adsorption chromatography using methods well known in the art.

Antigenic epitope-bearing peptides and polypeptides of the invention designed according to the above guidelines preferably contain a sequence of at least seven, more preferably at least nine and most preferably between about 15 and about 30 amino acids contained within the amino acid sequence of a polypeptide of the invention. However, peptides or polypeptides comprising a larger portion of an amino acid sequence of a polypeptide of the invention, containing about 30 to about 50 amino acids, or any length up to and including the entire amino acid sequence of a polypeptide of the invention, also are considered epitope-bearing peptides or polypeptides of the invention and also are useful for inducing antibodies that react with the mimicked protein. Preferably, the amino acid sequence of the epitope-bearing peptide is selected to provide substantial solubility in aqueous solvents (i.e., the sequence includes relatively hydrophilic residues and highly hydrophobic sequences are preferably avoided); and sequences containing proline residues are particularly preferred.
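The selection guidelines above (adequate length, avoidance of highly hydrophobic stretches, preference for proline) can be illustrated with a minimal sketch. The text does not name a hydrophobicity measure; the Kyte-Doolittle hydropathy scale is assumed here purely for illustration, and the function names, the 15-residue window, the cutoff of 0.0 and the test sequence are all hypothetical.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the text specifies none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide):
    """Average hydropathy per residue; lower means more hydrophilic."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def candidate_peptides(sequence, length=15, max_hydropathy=0.0):
    """List windows of `length` residues that are not overly hydrophobic.

    Each entry is (1-based start, peptide, mean hydropathy, has_proline).
    Proline-containing windows sort first, then the most hydrophilic.
    """
    candidates = []
    for start in range(len(sequence) - length + 1):
        window = sequence[start:start + length]
        score = mean_hydropathy(window)
        if score <= max_hydropathy:
            candidates.append((start + 1, window, score, "P" in window))
    return sorted(candidates, key=lambda c: (not c[3], c[2]))


if __name__ == "__main__":
    # Hypothetical sequence, not taken from SEQ ID NO:2.
    demo = "MKDEPSKDEPSKDEPSKDEPSIIIIVVVVLLLL"
    for start, peptide, score, has_pro in candidate_peptides(demo)[:3]:
        print(start, peptide, round(score, 2), has_pro)
```

Such a screen only ranks candidates; whether a given peptide actually elicits protein-reactive antibodies is determined experimentally.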

The epitope-bearing peptides and polypeptides of the invention may be produced by any conventional means for making peptides or polypeptides including recombinant means using nucleic acid molecules of the invention. For instance, a short epitope-bearing amino acid sequence may be fused to a larger polypeptide which acts as a carrier during recombinant production and purification, as well as during immunization to produce anti-peptide antibodies. Epitope-bearing peptides also may be synthesized using known methods of chemical synthesis. For instance, Houghten has described a simple method for synthesis of large numbers of peptides, such as 10-20 mg of 248 different 13 residue peptides representing single amino acid variants of a segment of the HA1 polypeptide which were prepared and characterized (by ELISA-type binding studies) in less than four weeks. Houghten, R. A., Proc. Natl. Acad. Sci. USA 82:5131-5135 (1985). This “Simultaneous Multiple Peptide Synthesis (SMPS)” process is further described in U.S. Pat. No. 4,631,211 to Houghten et al. (1986). In this procedure the individual resins for the solid-phase synthesis of various peptides are contained in separate solvent-permeable packets, enabling the optimal use of the many identical repetitive steps involved in solid-phase methods. A completely manual procedure allows 500-1000 or more syntheses to be conducted simultaneously. Houghten et al., supra, at 5134.

Epitope-bearing peptides and polypeptides of the invention are used to induce antibodies according to methods well known in the art. See, for instance, Sutcliffe et al., supra; Wilson et al., supra; Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et al., J. Gen. Virol. 66:2347-2354 (1985). Generally, animals may be immunized with free peptide; however, anti-peptide antibody titer may be boosted by coupling of the peptide to a macromolecular carrier, such as keyhole limpet hemocyanin (KLH) or tetanus toxoid. For instance, peptides containing cysteine may be coupled to carrier using a linker such as m-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS), while other peptides may be coupled to carrier using a more general linking agent such as glutaraldehyde. Animals such as rabbits, rats and mice are immunized with either free or carrier-coupled peptides, for instance, by intraperitoneal and/or intradermal injection of emulsions containing about 100 μg peptide or carrier protein and Freund's adjuvant. Several booster injections may be needed, for instance, at intervals of about two weeks, to provide a useful titer of anti-peptide antibody which can be detected, for example, by ELISA assay using free peptide adsorbed to a solid surface. The titer of anti-peptide antibodies in serum from an immunized animal may be increased by selection of anti-peptide antibodies, for instance, by adsorption to the peptide on a solid support and elution of the selected antibodies according to methods well known in the art.

Immunogenic epitope-bearing peptides of the invention, i.e., those parts of a protein that elicit an antibody response when the whole protein is the immunogen, are identified according to methods known in the art. For instance, Geysen et al., 1984, supra, discloses a procedure for rapid concurrent synthesis on solid supports of hundreds of peptides of sufficient purity to react in an enzyme-linked immunosorbent assay. Interaction of synthesized peptides with antibodies is then easily detected without removing them from the support. In this manner a peptide bearing an immunogenic epitope of a desired protein may be identified routinely by one of ordinary skill in the art. For instance, the immunologically important epitope in the coat protein of foot-and-mouth disease virus was located by Geysen et al. with a resolution of seven amino acids by synthesis of an overlapping set of all 208 possible hexapeptides covering the entire 213 amino acid sequence of the protein. Then, a complete replacement set of peptides in which all 20 amino acids were substituted in turn at every position within the epitope was synthesized, and the particular amino acids conferring specificity for the reaction with antibody were determined. Thus, peptide analogs of the epitope-bearing peptides of the invention can be made routinely by this method. U.S. Pat. No. 4,708,781 to Geysen (1987) further describes this method of identifying a peptide bearing an immunogenic epitope of a desired protein.
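The peptide counts in the foregoing example follow from simple enumeration: a protein of 213 residues contains 213 - 6 + 1 = 208 overlapping hexapeptides. The following minimal sketch illustrates both the overlapping set and a replacement set; the function names, the placeholder protein and the example hexapeptide are hypothetical and are not taken from Geysen et al.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def overlapping_peptides(sequence, length=6):
    """Return every contiguous window of `length` residues, each offset by one."""
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def replacement_set(peptide):
    """Map each position to the 20 peptides obtained by placing every
    amino acid at that position (the parent residue is included)."""
    return {
        pos: [peptide[:pos] + aa + peptide[pos + 1:] for aa in AMINO_ACIDS]
        for pos in range(len(peptide))
    }


if __name__ == "__main__":
    placeholder_protein = "A" * 213  # stand-in for a 213 residue coat protein
    print(len(overlapping_peptides(placeholder_protein)))  # 208

    variants = replacement_set("GDLQVL")  # hypothetical hexapeptide
    print(sum(len(v) for v in variants.values()))  # 120 (6 positions x 20 residues)
```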

Further still, U.S. Pat. No. 5,194,392 to Geysen (1990) describes a general method of detecting or determining the sequence of monomers (amino acids or other compounds) which is a topological equivalent of the epitope (i.e., a “mimotope”) which is complementary to a particular paratope (antigen binding site) of an antibody of interest. More generally, U.S. Pat. No. 4,433,092 to Geysen (1989) describes a method of detecting or determining a sequence of monomers which is a topographical equivalent of a ligand which is complementary to the ligand binding site of a particular receptor of interest. Similarly, U.S. Pat. No. 5,480,971 to Houghten, R. A. et al. (1996) on Peralkylated Oligopeptide Mixtures discloses linear C₁-C₇-alkyl peralkylated oligopeptides and sets and libraries of such peptides, as well as methods for using such oligopeptide sets and libraries for determining the sequence of a peralkylated oligopeptide that preferentially binds to an acceptor molecule of interest. Thus, non-peptide analogs of the epitope-bearing peptides of the invention also can be made routinely by these methods.

The present inventors have discovered that the CD33-like protein is a 551 residue protein exhibiting three main structural domains. First, the extracellular domain (which includes the ligand binding domain) was identified within residues from about 16 to about 422 in FIGS. 1A-1C (amino acids from about 1 to about 407 in SEQ ID NO:2). The mature extracellular domain has been predicted by the inventors as being about 407 amino acids in length with a molecular weight of about 45 kDa. Second, the transmembrane domain was identified within residues from about 423 to about 464 in FIGS. 1A-1C (amino acids from about 408 to about 449 in SEQ ID NO:2). Third, the intracellular domain was identified within residues from about 465 to about 551 in FIGS. 1A-1C (amino acids from about 450 to about 536 in SEQ ID NO:2). Thus, the invention further provides preferred CD33-like protein fragments comprising a polypeptide selected from: the mature CD33-like protein; the CD33-like protein extracellular domain; the CD33-like protein transmembrane domain; the CD33-like protein intracellular domain; or the CD33-like protein extracellular domain and intracellular domain with part or all of the transmembrane domain deleted. Methods for producing such CD33-like protein fragments are described above.
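The two numbering schemes recited above differ by a constant offset of 15 residues (residue 16 of FIGS. 1A-1C corresponds to amino acid 1 of SEQ ID NO:2). A minimal sketch of the conversion follows; the approximate domain boundaries are those stated above, while the names NUMBERING_OFFSET, DOMAINS_FIGURE and figure_to_seq_id are hypothetical.

```python
# Residue 16 in FIGS. 1A-1C is amino acid 1 in SEQ ID NO:2, per the text above.
NUMBERING_OFFSET = 15

# Approximate domain boundaries in the numbering of FIGS. 1A-1C.
DOMAINS_FIGURE = {
    "extracellular": (16, 422),
    "transmembrane": (423, 464),
    "intracellular": (465, 551),
}


def figure_to_seq_id(position):
    """Convert a FIGS. 1A-1C residue number to SEQ ID NO:2 numbering."""
    return position - NUMBERING_OFFSET


def domains_in_seq_id(domains):
    """Return {name: (start, end, length)} in SEQ ID NO:2 numbering."""
    converted = {}
    for name, (start, end) in domains.items():
        new_start, new_end = figure_to_seq_id(start), figure_to_seq_id(end)
        converted[name] = (new_start, new_end, new_end - new_start + 1)
    return converted


if __name__ == "__main__":
    for name, (start, end, length) in domains_in_seq_id(DOMAINS_FIGURE).items():
        print(f"{name}: {start}-{end} ({length} residues)")
```

Under these boundaries the three domains span 407, 42 and 87 residues, which together account for the 536 amino acids numbered from 1 in SEQ ID NO:2.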

The extracellular domains of receptors can be combined with parts of the constant domain of immunoglobulins (IgG), resulting in chimeric polypeptides. These fusion proteins often show an increased half-life in vivo. This has been shown, e.g., for chimeric proteins consisting of the first two domains of the human CD4 polypeptide and various domains of the constant regions of the heavy or light chains of mammalian immunoglobulins (European Patent Application Publication No. 394 827; Traunecker et al., Nature 331: 84-86 (1988)). Fusion proteins that have a disulfide-linked dimeric structure due to the IgG part can also be more efficient in binding and neutralizing the ligands than the monomeric extracellular domains alone (Fountoulakis et al., J. Biochem. 270: 3958-3964 (1995)).

As described in detail below, the polypeptides of the present invention and fragments thereof can be used to raise polyclonal and monoclonal antibodies, which are useful in diagnostic assays for detecting CD33-like protein expression as described below or as agonists and antagonists capable of enhancing or inhibiting CD33-like protein function. Further, such polypeptides can be used in the yeast two-hybrid system to “capture” CD33-like protein binding proteins which are also candidate agonists and antagonists according to the present invention. The yeast two-hybrid system is described in Fields and Song, Nature 340:245-246 (1989).

The entire disclosure of each document cited in this section on “CD33-Like Polypeptides and Fragments” is hereby incorporated herein by reference.

Cancer and Inflammatory Disease Diagnosis and Prognosis

It is believed that certain tissues in mammals with cancer or inflammatory disease contain significantly greater CD33-like protein gene copy number and express significantly enhanced levels of the CD33-like protein and mRNA encoding the CD33-like protein when compared to a corresponding “standard” mammal, i.e., a mammal of the same species not having the cancer or inflammatory disease. Enhanced levels of the CD33-like protein will be detected in certain body fluids (e.g., sera, plasma, urine, synovial and spinal fluid) from mammals with cancer or inflammatory disease when compared to sera from mammals of the same species not having the cancer or inflammatory disease. Thus, the invention provides a method useful during tumor or inflammatory disease diagnosis, which involves assaying the expression level of the gene encoding the CD33-like protein or the gene copy number in mammalian cells or body fluid and comparing the gene expression level or gene copy number with a standard CD33-like protein gene expression level or gene copy number, whereby an increase in the gene expression level or gene copy number over the standard is indicative of certain tumors or inflammatory disease.

Where a tumor or inflammatory disease diagnosis has already been made according to conventional methods, the present invention is useful as a prognostic indicator. For example, samples of bone marrow or peripheral blood can be obtained from patients diagnosed previously with leukemia to obtain leukemic blasts for examination of the CD33-like protein expression using anti-CD33-like protein monoclonal antibodies. Because the level of differentiation of normal myeloid cells is believed to be reflected by the concentration of the CD33-like protein antigen expressed, samples can be categorized as CD33-bright (immature) versus CD33-dull (mature). Patients whose leukemic blasts display the CD33-like protein antigen in an amount associated with immature myeloid cells will experience a worse clinical outcome than patients with leukemic blasts expressing a phenotype associated with more mature cells (i.e., a relatively lower CD33-like protein expression level).

By “assaying the expression level of the gene encoding the CD33-like protein” is intended qualitatively or quantitatively measuring or estimating the level of the CD33-like protein or the level of the mRNA encoding the CD33-like protein in a first biological sample either directly (e.g., by determining or estimating absolute protein level or mRNA level) or relatively (e.g., by comparing to the CD33-like protein level or mRNA level in a second biological sample). By “assaying the copy number of the gene encoding the CD33-like protein” is intended qualitatively or quantitatively measuring or estimating the gene copy number in a first biological sample either directly (e.g., by determining or estimating absolute gene copy number) or relatively (e.g., by comparing to the CD33-like protein gene copy number in a second biological sample).

Preferably, the CD33-like protein level, mRNA level, or gene copy number in the first biological sample is measured or estimated and compared to a standard CD33-like protein level, mRNA level, or gene copy number, the standard being taken from a second biological sample obtained from an individual not having the cancer or inflammatory disease. Alternatively, where the method is used as a prognostic indicator, both the first and second biological samples can be taken from individuals having the cancer or inflammatory disease and the relative expression levels or copy number will be measured to determine prognosis. As will be appreciated in the art, once a standard CD33-like protein level, mRNA level, or gene copy number is known, it can be used repeatedly as a standard for comparison.

By “biological sample” is intended any biological sample obtained from an individual, cell line, tissue culture, or other source which contains CD33-like protein or mRNA. Biological samples include mammalian body fluids (such as sera, plasma, urine, synovial fluid and spinal fluid) which contain secreted mature CD33-like protein or its soluble extracellular domain, and eosinophils, spleen tissue, monocytes, neutrophils, tonsils, and bone marrow. Methods for obtaining tissue biopsies and body fluids from mammals are well known in the art. Where the biological sample is to include mRNA, a tissue biopsy is the preferred source.

The present invention is useful for detecting cancer and inflammatory disease in mammals. In particular the invention is useful during diagnosis or prognosis of the following types of cancers and inflammatory diseases in mammals: metastatic tumors, leukemias, lymphomas, arthritis, and allergic diseases. Preferred mammals include monkeys, apes, cats, dogs, cows, pigs, horses, rabbits and humans. Particularly preferred are humans.

Total cellular RNA can be isolated from a biological sample using any suitable technique such as the single-step guanidinium-thiocyanate-phenol-chloroform method described in Chomczynski and Sacchi, Anal. Biochem. 162:156-159 (1987). Levels of mRNA encoding the CD33-like protein are then assayed using any appropriate method. These include Northern blot analysis, S1 nuclease mapping, the polymerase chain reaction (PCR), reverse transcription in combination with the polymerase chain reaction (RT-PCR), and reverse transcription in combination with the ligase chain reaction (RT-LCR).

Northern blot analysis can be performed as described in Harada et al., Cell 63:303-312 (1990). Briefly, total RNA is prepared from a biological sample as described above. For the Northern blot, the RNA is denatured in an appropriate buffer (such as glyoxal/dimethyl sulfoxide/sodium phosphate buffer), subjected to agarose gel electrophoresis, and transferred onto a nitrocellulose filter. After the RNAs have been linked to the filter by a UV linker, the filter is prehybridized in a solution containing formamide, SSC, Denhardt's solution, denatured salmon sperm DNA, SDS, and sodium phosphate buffer. CD33-like protein cDNA labeled according to any appropriate method (such as the ³²P-multiprimed DNA labeling system (Amersham)) is used as probe. After hybridization overnight, the filter is washed and exposed to x-ray film. cDNA for use as probe according to the present invention is described in the sections above and will preferably be at least 15 bp in length.

S1 mapping can be performed as described in Fujita et al., Cell 49:357-367 (1987). To prepare probe DNA for use in S1 mapping, the sense strand of above-described cDNA is used as a template to synthesize labeled antisense DNA. The antisense DNA can then be digested using an appropriate restriction endonuclease to generate further DNA probes of a desired length. Such antisense probes are useful for visualizing protected bands corresponding to the target mRNA (i.e., mRNA encoding the CD33-like protein). Northern blot analysis can be performed as described above.

Preferably, levels of mRNA encoding the CD33-like protein are assayed using the RT-PCR method described in Makino et al., Technique 2:295-301 (1990). By this method, the radioactivities of the “amplicons” in the polyacrylamide gel bands are linearly related to the initial concentration of the target mRNA. Briefly, this method involves adding total RNA isolated from a biological sample to a reaction mixture containing a RT primer and appropriate buffer. After incubating for primer annealing, the mixture can be supplemented with a RT buffer, dNTPs, DTT, RNase inhibitor and reverse transcriptase. After incubation to achieve reverse transcription of the RNA, the RT products are then subjected to PCR using labeled primers. Alternatively, rather than labeling the primers, a labeled dNTP can be included in the PCR reaction mixture. PCR amplification can be performed in a DNA thermal cycler according to conventional techniques. After a suitable number of rounds to achieve amplification, the PCR reaction mixture is electrophoresed on a polyacrylamide gel. After drying the gel, the radioactivity of the appropriate bands (corresponding to the mRNA encoding the CD33-like protein) is quantified using an imaging analyzer. RT and PCR reaction ingredients and conditions, reagent and gel concentrations, and labeling methods are well known in the art. Variations on the RT-PCR method will be apparent to the skilled artisan.

Any set of oligonucleotide primers which will amplify reverse transcribed target mRNA can be used and can be designed as described in the sections above.

Assaying CD33-like protein gene copy number can occur according to any known technique, such as, for example, in situ hybridization of tissue samples with a cDNA probe described above.

Assaying CD33-like protein levels in a biological sample can occur using any art-known method. Preferred for assaying CD33-like protein levels in a biological sample are antibody-based techniques. For example, CD33-like protein expression in tissues can be studied with classical immunohistological methods. In these, the specific recognition is provided by the primary antibody (polyclonal or monoclonal) but the secondary detection system can utilize fluorescent, enzyme, or other conjugated secondary antibodies. As a result, an immunohistological staining of tissue section for pathological examination is obtained. Tissues can also be extracted, e.g., with urea and neutral detergent, for the liberation of CD33-like protein for Western-blot or dot/slot assay (Jalkanen, M., et al., J. Cell. Biol. 101:976-985 (1985); Jalkanen, M., et al., J. Cell. Biol. 105:3087-3096 (1987)). In this technique, which is based on the use of cationic solid phases, quantitation of CD33-like protein can be accomplished using isolated CD33-like protein as a standard. This technique can also be applied to body fluids. With these samples, a molar concentration of CD33-like protein will aid in setting standard values of CD33-like protein content for different body fluids, like serum, plasma, urine, spinal fluid, etc. The normal range of CD33-like protein amounts can then be set using values from healthy individuals, which can be compared to those obtained from a test subject.

Other antibody-based methods useful for detecting CD33-like protein gene expression include immunoassays, such as the enzyme linked immunosorbent assay (ELISA) and the radioimmunoassay (RIA). For example, a CD33-like protein-specific monoclonal antibody can be used both as an immunoabsorbent and as an enzyme-labeled probe to detect and quantify the CD33-like protein. The amount of CD33-like protein present in the sample can be calculated by reference to the amount present in a standard preparation using a linear regression computer algorithm. Such an ELISA for detecting a tumor antigen is described in Iacobelli et al., Breast Cancer Research and Treatment 11:19-30 (1988). In another ELISA assay, two distinct specific monoclonal antibodies can be used to detect CD33-like protein in a body fluid. In this assay, one of the antibodies is used as the immunoabsorbent and the other as the enzyme-labeled probe.
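The calculation by reference to a standard preparation using linear regression can be sketched as follows. This is an illustrative least-squares fit only, not the algorithm of Iacobelli et al.; the function names, concentration units and absorbance readings are hypothetical, and a linear response over the range of the standards is assumed.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_signal(signal, slope, intercept):
    """Invert the standard curve to estimate the analyte concentration."""
    return (signal - intercept) / slope


if __name__ == "__main__":
    # Hypothetical standard preparation: known concentrations (ng/ml)
    # and the absorbance read for each standard well.
    standards_ng_per_ml = [0.0, 10.0, 20.0, 40.0]
    absorbances = [0.05, 0.25, 0.45, 0.85]

    slope, intercept = fit_line(standards_ng_per_ml, absorbances)
    # A test sample reading 0.65 falls on this curve at approximately 30 ng/ml.
    print(concentration_from_signal(0.65, slope, intercept))
```

Readings outside the range spanned by the standards should not be extrapolated from such a fit.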

The above techniques may be conducted essentially as a “one-step” or “two-step” assay. The “one-step” assay involves contacting CD33-like protein with immobilized antibody and, without washing, contacting the mixture with the labeled antibody. The “two-step” assay involves washing before contacting the mixture with the labeled antibody. Other conventional methods may also be employed as suitable. It is usually desirable to immobilize one component of the assay system on a support, thereby allowing other components of the system to be brought into contact with the component and readily removed from the sample.

Suitable enzyme labels include, for example, those from the oxidase group, which catalyze the production of hydrogen peroxide by reacting with substrate. Glucose oxidase is particularly preferred as it has good stability and its substrate (glucose) is readily available. Activity of an oxidase label may be assayed by measuring the concentration of hydrogen peroxide formed by the enzyme-labeled antibody/substrate reaction. Besides enzymes, other suitable labels include radioisotopes, such as iodine (¹²⁵I, ¹²¹I), carbon (¹⁴C), sulphur (³⁵S), tritium (³H), indium (¹¹²In), and technetium (^(99m)Tc), and fluorescent labels, such as fluorescein and rhodamine, and biotin.

In addition to assaying CD33-like protein levels in a biological sample obtained from an individual, CD33-like protein can also be detected in vivo by imaging. Antibody labels or markers for in vivo imaging of CD33-like protein include those detectable by X-radiography, NMR or ESR. For X-radiography, suitable labels include radioisotopes such as barium or caesium, which emit detectable radiation but are not overtly harmful to the subject. Suitable markers for NMR and ESR include those with a detectable characteristic spin, such as deuterium, which may be incorporated into the antibody by labeling of nutrients for the relevant hybridoma.

A CD33-like protein-specific antibody or antibody fragment which has been labeled with an appropriate detectable imaging moiety, such as a radioisotope (for example, ¹³¹I, ¹¹²In, ^(99m)Tc), a radio-opaque substance, or a material detectable by nuclear magnetic resonance, is introduced (for example, parenterally, subcutaneously or intraperitoneally) into the mammal to be examined for cancer. It will be understood in the art that the size of the subject and the imaging system used will determine the quantity of imaging moiety needed to produce diagnostic images. In the case of a radioisotope moiety, for a human subject, the quantity of radioactivity injected will normally range from about 5 to 20 millicuries of ^(99m)Tc. The labeled antibody or antibody fragment will then preferentially accumulate at the location of cells which contain CD33-like protein. In vivo tumor imaging is described in S. W. Burchiel et al., “Immunopharmacokinetics of Radiolabelled Antibodies and Their Fragments” (Chapter 13 in Tumor Imaging: The Radiochemical Detection of Cancer, eds., S. W. Burchiel and B. A. Rhodes, Masson Publishing Inc. (1982)).

CD33-like protein-specific antibodies for use in the present invention can be raised against the intact CD33-like protein or an antigenic polypeptide fragment thereof, which may be presented together with a carrier protein, such as an albumin, to an animal system (such as rabbit or mouse) or, if it is long enough (at least about 25 amino acids), without a carrier.

As used herein, the term “antibody” (Ab) or “monoclonal antibody” (MoAb) is meant to include intact molecules as well as antibody fragments (such as, for example, Fab and F(ab′)₂ fragments) which are capable of specifically binding to CD33-like protein. Fab and F(ab′)₂ fragments lack the Fc fragment of intact antibody, clear more rapidly from the circulation, and may have less non-specific tissue binding than an intact antibody (Wahl et al., J. Nucl. Med. 24:316-325 (1983)). Thus, these fragments are preferred.

The antibodies of the present invention may be prepared by any of a variety of methods. For example, cells expressing the CD33-like protein or an antigenic fragment thereof can be administered to an animal in order to induce the production of sera containing polyclonal antibodies. In a preferred method, a preparation of CD33-like protein is prepared and purified to render it substantially free of natural contaminants. Such a preparation is then introduced into an animal in order to produce polyclonal antisera of greater specific activity.

In the most preferred method, the antibodies of the present invention are monoclonal antibodies (or CD33-like protein binding fragments thereof). Such monoclonal antibodies can be prepared using hybridoma technology (Kohler et al., Nature 256:495 (1975); Kohler et al., Eur. J. Immunol. 6:511 (1976); Kohler et al., Eur. J. Immunol. 6:292 (1976); Hammerling et al., In: Monoclonal Antibodies and T-Cell Hybridomas, Elsevier, N.Y., pp. 563-681 (1981)). In general, such procedures involve immunizing an animal (preferably a mouse) with a CD33-like protein antigen or, more preferably, with a CD33-like protein-expressing cell. Suitable cells can be recognized by their capacity to bind anti-CD33-like protein antibody. Such cells may be cultured in any suitable tissue culture medium; however, it is preferable to culture cells in Earle's modified Eagle's medium supplemented with 10% fetal bovine serum (inactivated at about 56° C.), and supplemented with about 10 μg/l of nonessential amino acids, about 1,000 U/ml of penicillin, and about 100 μg/ml of streptomycin. The splenocytes of such mice are extracted and fused with a suitable myeloma cell line. Any suitable myeloma cell line may be employed in accordance with the present invention; however, it is preferable to employ the parent myeloma cell line (SP2/0), available from the American Type Culture Collection, Manassas, Va. After fusion, the resulting hybridoma cells are selectively maintained in HAT medium, and then cloned by limiting dilution as described by Wands et al. (Gastroenterology 80:225-232 (1981)). The hybridoma cells obtained through such a selection are then assayed to identify clones which secrete antibodies capable of binding the CD33-like protein antigen.

Alternatively, additional antibodies capable of binding to the CD33-like protein antigen may be produced in a two-step procedure through the use of anti-idiotypic antibodies. Such a method makes use of the fact that antibodies are themselves antigens, and that, therefore, it is possible to obtain an antibody which binds to a second antibody. In accordance with this method, CD33-like protein-specific antibodies are used to immunize an animal, preferably a mouse. The splenocytes of such an animal are then used to produce hybridoma cells, and the hybridoma cells are screened to identify clones which produce an antibody whose ability to bind to the CD33-like protein-specific antibody can be blocked by the CD33-like protein antigen. Such antibodies comprise anti-idiotypic antibodies to the CD33-like protein-specific antibody and can be used to immunize an animal to induce formation of further CD33-like protein-specific antibodies.

It will be appreciated that Fab and F(ab′)₂ and other fragments of the antibodies of the present invention may be used according to the methods disclosed herein. Such fragments are typically produced by proteolytic cleavage, using enzymes such as papain (to produce Fab fragments) or pepsin (to produce F(ab′)₂ fragments). Alternatively, CD33-like protein-binding fragments can be produced through the application of recombinant DNA technology or through synthetic chemistry.

Where in vivo imaging is used to detect enhanced levels of CD33-like protein for tumor diagnosis in humans, it may be preferable to use “humanized” chimeric monoclonal antibodies. Such antibodies can be produced using genetic constructs derived from hybridoma cells producing the monoclonal antibodies described above. Methods for producing chimeric antibodies are known in the art. See, for review, Morrison, Science 229:1202 (1985); Oi et al., BioTechniques 4:214 (1986); Cabilly et al., U.S. Pat. No. 4,816,567; Taniguchi et al., EP 171496; Morrison et al., EP 173494; Neuberger et al., WO 8601533; Robinson et al., WO 8702671; Boulianne et al., Nature 312:643 (1984); Neuberger et al., Nature 314:268 (1985).

Further suitable labels for the CD33-like protein-specific antibodies of the present invention are provided below. Examples of suitable enzyme labels include malate dehydrogenase, staphylococcal nuclease, delta-5-steroid isomerase, yeast-alcohol dehydrogenase, alpha-glycerol phosphate dehydrogenase, triose phosphate isomerase, peroxidase, alkaline phosphatase, asparaginase, glucose oxidase, beta-galactosidase, ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase, and acetylcholine esterase.

Examples of suitable radioisotopic labels include ³H, ¹¹¹In, ¹²⁵I, ¹³¹I, ³²P, ³⁵S, ¹⁴C, ⁵¹Cr, ⁵⁷Co, ⁵⁸Co, ⁵⁹Fe, ⁷⁵Se, ¹⁵²Eu, ⁹⁰Y, ⁶⁷Cu, ²¹⁷Ci, ²¹¹At, ²¹²Pb, ⁴⁷Sc, ¹⁰⁹Pd, etc. ¹¹¹In is a preferred isotope where in vivo imaging is used since it avoids the problem of dehalogenation of the ¹²¹I or ¹³¹I-labeled monoclonal antibody by the liver. In addition, this radionuclide has a more favorable gamma emission energy for imaging (Perkins et al., Eur. J. Nucl. Med. 10:296-301 (1985); Carasquillo et al., J. Nucl. Med. 28:281-287 (1987)). For example, ¹¹¹In coupled to monoclonal antibodies with 1-(P-isothiocyanatobenzyl)-DTPA has shown little uptake in non-tumorous tissues, particularly the liver, and therefore enhances specificity of tumor localization (Esteban et al., J. Nucl. Med. 28:861-870 (1987)).

Examples of suitable non-radioactive isotopic labels include ¹⁵⁷Gd, ⁵⁵Mn, ¹⁶²Dy, ⁵²Tr, and ⁵⁶Fe.

Examples of suitable fluorescent labels include an ¹⁵²Eu label, a fluorescein label, an isothiocyanate label, a rhodamine label, a phycoerythrin label, a phycocyanin label, an allophycocyanin label, an o-phthaldehyde label, and a fluorescamine label.

Examples of suitable toxin labels include diphtheria toxin, ricin, and cholera toxin.

Examples of chemiluminescent labels include a luminol label, an isoluminol label, an aromatic acridinium ester label, an imidazole label, an acridinium salt label, an oxalate ester label, a luciferin label, a luciferase label, and an aequorin label.

Examples of nuclear magnetic resonance contrasting agents include heavy metal nuclei such as Gd, Mn, and iron.

Typical techniques for binding the above-described labels to antibodies are provided by Kennedy et al. (Clin. Chim. Acta 70:1-31 (1976)), and Schuurs et al. (Clin. Chim. Acta 81:1-40 (1977)). Coupling techniques mentioned in the latter are the glutaraldehyde method, the periodate method, the dimaleimide method, and the m-maleimidobenzyl-N-hydroxy-succinimide ester method, all of which methods are incorporated by reference herein.

Therapeutics

CD33 is expressed by clonogenic leukemic cells from about 90% of patients with acute myeloid leukemia (AML). While about 60-70% of adults suffering from AML experience complete remission after chemotherapy, most of these patients will ultimately die of relapsed leukemia (Robertson et al., Blood 79:2229-2236 (1992)). It is believed that, like CD33, the CD33-like protein of the present invention is also expressed by clonogenic leukemic cells from the vast majority of patients with AML.

Postremission therapy is more successful when patients are treated with an allogeneic bone marrow transplant. However, this mode of therapy is often unavailable to an individual because of their age or because of the lack of an appropriate donor. An alternative treatment utilizes autologous bone marrow harvested while the patient was in remission. However, if residual leukemia cells exist, such an autograft could result in relapse for the patient. Consequently, a need exists for methods of purging leukemic cells from the autografts of patients with advanced AML. Such a method would ideally involve the use of a cytotoxic agent capable of selectively eliminating or removing tumor cells while sparing the hematopoietic stem cells necessary for engraftment. Studies have shown that the majority of leukemia cells are incapable of sustained replication; these cells are, however, produced by a small number of leukemic “stem cells” which appear to have a different surface antigen phenotype from the other cells, i.e., they are believed to lack the CD33-like protein antigen of the present invention.

By the invention, a method is provided for purging leukemic hematopoietic cells from the autografts of patients with AML. The method involves obtaining bone marrow (BM) from an AML patient by, for example, percutaneous aspirations from the posterior iliac crest, isolating BM mononuclear cells by Ficoll-hypaque density gradient centrifugation, and incubating with an anti-CD33-like protein monoclonal antibody (MoAb), for example, 3-5 times for 15-30 minutes at 4-6° C., followed by incubation with rabbit complement at about 37° C. for 30 minutes. (Rabbit complement tested for optimal specific cytotoxicity can be obtained as described in Roy et al., Leuk Res 14:407 (1990)). The patient is then subjected to myeloablative chemotherapy as described in Robertson et al., Blood 79 (9):2229 (1992), followed by reinfusion of the treated autologous BM according to standard technique. By the invention, clonogenic tumor cells are depleted from the marrow while sparing hematopoietic cells necessary for engraftment.

In a further embodiment, the invention provides an in vivo method for selectively killing or inhibiting growth of tumor cells expressing the CD33-like protein antigen of the present invention. Such tumor cells include metastatic tumors, leukemias and lymphomas. The method involves administering to a patient an effective amount of an antagonist to inhibit the CD33-like protein receptor signaling pathway. By the invention, administering such an antagonist of the CD33-like protein to a patient is also useful for treating inflammatory diseases including arthritis and colitis.

Antagonists for use in the present invention include polyclonal and monoclonal antibodies raised against the CD33-like polypeptides or a fragment thereof. Such antagonist antibodies raised against CD33-like polypeptides can be generated as described in Caron et al., Cancer Research 52:6761 (1992); Juric et al., Cancer Research (Suppl.) 55:5908s-5910s (1995); and Robertson et al., Blood 79:2229-2236 (1992).

Other potential antagonists include antisense molecules. Antisense technology can be used to control gene expression through antisense DNA or RNA or through triple-helix formation. Antisense techniques are discussed, for example, in Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, Fla. (1988). Triple helix formation is discussed in, for instance, Lee et al., Nucleic Acids Research 6:3073 (1979); Cooney et al., Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991). The methods are based on binding of a polynucleotide to a complementary DNA or RNA. For example, the 5′ coding portion of a polynucleotide that encodes the mature polypeptide of the present invention may be used to design an antisense RNA oligonucleotide of from about 10 to 40 base pairs in length. A DNA oligonucleotide is designed to be complementary to a region of the gene involved in transcription, thereby preventing transcription and the production of the receptor. The antisense RNA oligonucleotide hybridizes to the mRNA in vivo and blocks translation of the mRNA molecule into receptor polypeptide. The oligonucleotides described above can also be delivered to cells such that the antisense RNA or DNA may be expressed in vivo to inhibit production of the receptor.
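By way of illustration only, the design of an antisense RNA oligonucleotide from a coding-strand target can be sketched as follows. The function names are hypothetical, and the target used here is simply the first 18 nucleotides of the coding region quoted in the primer descriptions of the Examples; it is an assumed stand-in, not a sequence the specification identifies as an effective antisense target.

```python
# Minimal sketch: derive an antisense RNA oligonucleotide from a sense-strand
# DNA target. Names and the chosen target are illustrative assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def antisense_rna(target_dna: str, min_len: int = 10, max_len: int = 40) -> str:
    """Return the antisense RNA (5' to 3') for a sense-strand DNA target.

    The length limits mirror the "about 10 to 40" range stated in the text.
    """
    if not min_len <= len(target_dna) <= max_len:
        raise ValueError("target length outside the 10 to 40 range")
    return reverse_complement(target_dna).replace("T", "U")


# First 18 nucleotides of the coding region, as quoted in the 5' primers.
target = "ATGCTGCCCCTGCTGCTG"
print(antisense_rna(target))  # CAGCAGCAGGGGCAGCAU
```

The oligonucleotide is written 5′ to 3′, so it pairs antiparallel with the corresponding stretch of mRNA.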

Further antagonists according to the present invention include soluble forms of CD33-like protein, i.e., CD33-like protein fragments that include the extracellular region from the full length receptor. Such soluble forms of the receptor, which may be naturally occurring or synthetic, antagonize CD33-like protein mediated signaling by competing with the cell surface CD33-like protein for binding to CD33 receptor ligands.

As indicated, polyclonal and monoclonal antibody antagonists according to the present invention can be raised according to the methods disclosed in WO 93/20848 and U.S. Pat. No. 5,239,062. The term “antibody” (Ab) or “monoclonal antibody” (MoAb) as used herein is meant to include intact molecules as well as fragments thereof (such as, for example, Fab and F(ab′)₂ fragments) which are capable of binding an antigen. Fab and F(ab′)₂ fragments lack the Fc fragment of intact antibody, clear more rapidly from the circulation, and may have less non-specific tissue binding than an intact antibody (Wahl et al., J. Nucl. Med. 24:316-325 (1983)).

Antibodies according to the present invention may be prepared by any of a variety of methods using CD33-like protein immunogens of the present invention. As indicated, such CD33-like protein immunogens include the full length CD33-like protein polypeptide (which may or may not include the leader sequence) and CD33-like protein polypeptide fragments such as the extracellular domain, the transmembrane domain, and the intracellular domain.

In a preferred method, antibodies according to the present invention are MoAbs. Such MoAbs can be prepared using hybridoma technology as described above (Köhler and Milstein, Nature 256:495-497 (1975) and U.S. Pat. No. 4,376,110; Harlow et al., Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1988; Monoclonal Antibodies and Hybridomas: A New Dimension in Biological Analyses, Plenum Press, New York, N.Y., 1980; Campbell, “Monoclonal Antibody Technology,” In: Laboratory Techniques in Biochemistry and Molecular Biology, Volume 13 (Burdon et al., eds.), Elsevier, Amsterdam (1984)).

Also intended within the scope of the present invention are humanized chimeric antibodies, produced using genetic constructs derived from hybridoma cells producing the MoAbs described above. Methods for production of chimeric antibodies are known in the art. See, for review: Morrison, Science, 229:1202-1207 (1985); Oi et al., BioTechniques 4:214 (1986); see, also: Cabilly et al., U.S. Pat. No. 4,816,567 (Mar. 28, 1989); Taniguchi et al., EPO Patent Public. EP171496 (Feb. 19, 1986); Morrison et al., EPO Patent Pub. EP173494 (Mar. 5, 1986); Neuberger et al., PCT Pub. WO8601533 (Mar. 13, 1986); Robinson et al., PCT Pub. WO 8702671 (May 7, 1987); Boulianne et al., Nature 312:643-646 (1984); Neuberger et al., Nature 314:268-270 (1985).

A particularly preferred method for generating an anti-CD33-like protein humanized MoAb is described in Caron et al., Cancer Res 52:6761 (1992).

Proteins and other compounds which bind the CD33-like protein domains are also candidate antagonists according to the present invention. Such binding compounds can be “captured” using the yeast two-hybrid system (Fields and Song, Nature 340:245-246 (1989)). A modified version of the yeast two-hybrid system has been described by Roger Brent and his colleagues (Gyuris, J. et al., Cell 75:791-803 (1993); Zervos, A. S. et al., Cell 72:223-232 (1993)). Briefly, a domain of the CD33-like polypeptide is used as bait for binding compounds. Positives are then selected by their ability to grow on plates lacking leucine, and then further tested for their ability to turn blue on plates with X-gal, as previously described in great detail (Gyuris, J. et al., Cell 75:791-803 (1993)). Preferably, the yeast two-hybrid system is used according to the present invention to capture compounds which bind to either the CD33-like extracellular ligand binding domain or to the CD33-like protein intracellular domain. Such compounds are good candidate antagonists of the present invention. This system has been used previously to isolate proteins which bind to the intracellular domain of the p55 and p75 TNF receptors (WO 95/31544).

The invention further provides methods for using the CD33-like protein binding compounds described above as vehicles for selective killing of tumor cells.

The specificity of the binding compounds to the CD33-like protein can be determined by their affinity. Such specificity exists if the dissociation constant (K_D = 1/K, where K is the affinity constant) of the moiety is <1 μM, preferably <100 nM, and most preferably <1 nM. Antibody molecules will typically have a K_D in the lower ranges. K_D = [R][L]/[R-L], where [R], [L], and [R-L] are the concentrations at equilibrium of the receptor or CD33-like protein (R), ligand, antibody, or peptide (L) and receptor-ligand complex (R-L), respectively. Typically, the binding interactions between ligand or peptide and receptor or antigen include reversible noncovalent associations such as electrostatic attraction, Van der Waals forces, and hydrogen bonds.
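As an illustrative sketch only, the affinity criterion can be expressed in a few lines using the conventional definition of the dissociation constant, K_D = [R][L]/[R-L] (the reciprocal of the affinity constant K). The function names and the equilibrium concentrations in the usage lines are hypothetical values chosen for the example, not measurements.

```python
# Minimal sketch of the K_D-based specificity criterion; all concentrations
# are in molar (M). Function names and example values are assumptions.

def dissociation_constant(r_free: float, l_free: float, complex_conc: float) -> float:
    """K_D = [R][L] / [R-L], from equilibrium concentrations in M."""
    if complex_conc <= 0:
        raise ValueError("complex concentration must be positive")
    return r_free * l_free / complex_conc


def affinity_tier(kd_molar: float) -> str:
    """Classify a K_D against the <1 uM, <100 nM and <1 nM thresholds."""
    if kd_molar < 1e-9:
        return "most preferred (<1 nM)"
    if kd_molar < 100e-9:
        return "preferred (<100 nM)"
    if kd_molar < 1e-6:
        return "specific (<1 uM)"
    return "not specific by this criterion"


# Hypothetical equilibrium: 2 nM free receptor, 5 nM free ligand, 1 nM complex.
kd = dissociation_constant(r_free=2e-9, l_free=5e-9, complex_conc=1e-9)
print(f"K_D = {kd:.1e} M -> {affinity_tier(kd)}")
```

With these assumed concentrations K_D is about 10 nM, which falls in the "preferred" range.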

Other assay formats may involve the detection of the presence or absence of various physiological or chemical changes that result from the interaction, such as down modulation, internalization, or an increase in phosphorylation, as described in Receptor-Effector Coupling: A Practical Approach, ed. Hulme, IRL Press, Oxford (1990).

Gelonin is a glycoprotein (m.w. approximately 29,000-30,000 daltons) purified from the seeds of Gelonium multiflorum. Gelonin belongs to a class of potent ribosome-inactivating plant toxins. Other members of this class of ribosome-inactivating plant toxins are the chains of abrin, ricin and modeccin. Gelonin, like abrin and ricin, inhibits protein synthesis by damaging the 60S sub-unit of mammalian ribosomes. Gelonin appears to be stable to chemical and physical treatment. Furthermore, gelonin itself does not bind to cells and, therefore, is non-toxic (except in high concentrations) and is safe to manipulate in the laboratory. The inactivation of ribosomes is irreversible, does not appear to involve co-factors and occurs with an efficiency which suggests that gelonin acts enzymatically.

Gelonin and ricin are among the most active toxins which inhibit protein synthesis on a protein weight basis. Gelonin is 10 to 1000 times more active in inhibiting protein synthesis than ricin A chain. Peptides like ricin and abrin are composed of two chains, an A chain which is the toxic unit and a B chain which acts by binding to cells. Unlike ricin and abrin, gelonin is composed of a single chain, and because it lacks a B chain for binding to cells, it is itself relatively non-toxic to intact cells.

Mammalian cells apparently lack the ability to bind and/or to internalize the native gelonin molecule. Conjugates of gelonin with a CD33-like protein binding compound of the present invention provide both a specific method for binding the gelonin to the cell and a route for internalization of the gelonin-binding compound complex.

Where the CD33-like protein binding compound is a MoAb, the cytotoxic moiety of the immunotoxin may be a cytotoxic drug or an enzymatically active toxin of bacterial or plant origin, or an enzymatically active fragment (“A chain”) of such a toxin. Enzymatically active toxins and fragments thereof are preferred and are exemplified by gelonin, diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, mitogellin, restrictocin, phenomycin, and enomycin. Most preferred is the conjugation with gelonin.

Biological response modifiers which may be coupled to the anti-CD33-like protein MoAb to form an immunotoxin include, but are not limited to, lymphokines and cytokines such as IL-1, IL-2, interferons (alpha, beta or gamma), TNF, LT, TGF-beta and IL-6. These biological response modifiers have a variety of effects on tumor cells. Among these are increased tumor cell killing by direct action as well as increased tumor cell killing by increased host defense mediated processes. Conjugation of an MoAb to these biological response modifiers will allow selective localization within tumors and, hence, improved anti-proliferative effects while suppressing non-specific effects leading to toxicity of non-target cells.

Cytotoxic drugs (and derivatives thereof) which are useful in the present invention include, but are not limited to, adriamycin, cis-platinum complex, bleomycin and methotrexate. These cytotoxic drugs are useful for clinical management of recurrent tumors, but their use is complicated by severe side effects and damage caused to non-target cells. The MoAb serves as a useful carrier of such drugs, providing an efficient means of both delivery to the tumor and enhanced entry into the tumor cells themselves. In addition, specific antibody delivery of cytotoxic drugs to tumors will provide protection of sensitive sites such as the liver that do not express CD33-like protein and bone marrow stem cells from the deleterious action of the chemotherapeutic agents. Use of drugs conjugated to the carrier antibody as a delivery system allows lower dosage of the drug itself, since all drug moieties are conjugated to antibodies which concentrate within the tumor or leukemia.

Conjugates of the monoclonal antibody may be made using a variety of bifunctional protein coupling agents. Examples of such reagents are SPDP, iminothiolane (IT), bifunctional derivatives of imidoesters such as dimethyl adipimidate HCl, active esters such as disuccinimidyl suberate, aldehydes such as glutaraldehyde, bis-azido compounds such as bis(p-azidobenzoyl) hexanediamine, bis-diazonium derivatives such as bis-(p-diazoniumbenzoyl)-ethylenediamine, diisocyanates such as tolylene 2,6-diisocyanate, and bis-active fluorine compounds such as 1,5-difluoro-2,4-dinitrobenzene.

When used in vivo for therapy, the immunotoxins are administered to the human or animal patient in therapeutically effective amounts, i.e., amounts that eliminate or reduce the tumor burden or in amounts to eliminate residual disease after an earlier treatment with chemotherapy or radiation therapy. They will normally be administered parenterally, preferably intravenously. The dose and dosage regimen will depend upon the nature of the leukemia and its population, the characteristics of the particular immunotoxin, e.g., its therapeutic index, the patient, and the patient's history. The amount of immunotoxin administered will typically be in the range of about 0.01 to about 1.0 mg/kg of patient weight.

For parenteral administration the immunotoxins will be formulated in a unit dosage injectable form (solution, suspension, emulsion) in association with a pharmaceutically acceptable parenteral vehicle. Such vehicles are inherently non-toxic and non-therapeutic. Examples of such vehicles are water, saline, Ringer's solution, dextrose solution, and 5% human serum albumin. Nonaqueous vehicles such as fixed oils and ethyl oleate may also be used. Liposomes may be used as carriers. The vehicle may contain minor amounts of additives such as substances that enhance isotonicity and chemical stability, e.g., buffers and preservatives. The immunotoxin will typically be formulated in such vehicles at concentrations of about 0.1 mg/ml to 10 mg/ml.
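The relationship between the stated dose range (about 0.01 to about 1.0 mg/kg) and the stated formulation range (about 0.1 mg/ml to 10 mg/ml) is simple arithmetic, sketched below for illustration. The function names, the 70 kg patient weight and the particular dose and concentration chosen are assumptions made for the example only.

```python
# Arithmetic sketch only: total amount and injection volume implied by the
# stated ranges. Names and example values are hypothetical.

def total_dose_mg(weight_kg: float, dose_mg_per_kg: float) -> float:
    """Total immunotoxin amount for a dose within 0.01 to 1.0 mg/kg."""
    if not 0.01 <= dose_mg_per_kg <= 1.0:
        raise ValueError("dose outside the stated 0.01 to 1.0 mg/kg range")
    return weight_kg * dose_mg_per_kg


def injection_volume_ml(dose_mg: float, conc_mg_per_ml: float) -> float:
    """Volume needed for a formulation within 0.1 to 10 mg/ml."""
    if not 0.1 <= conc_mg_per_ml <= 10.0:
        raise ValueError("concentration outside the stated 0.1 to 10 mg/ml range")
    return dose_mg / conc_mg_per_ml


# Assumed example: 70 kg patient, 0.1 mg/kg, formulated at 1 mg/ml.
dose = total_dose_mg(weight_kg=70.0, dose_mg_per_kg=0.1)
volume = injection_volume_ml(dose, conc_mg_per_ml=1.0)
print(f"{dose:.1f} mg in {volume:.1f} ml")  # 7.0 mg in 7.0 ml
```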

The immunotoxins of the present invention may also be used in a method of killing tumor cells in bone marrow. In this method, the bone marrow is first removed from an individual having a neoplastic disease such as leukemia. Subsequently, the bone marrow is treated with a cytocidally effective dose of an immunotoxin of the present invention.

Having generally described the invention, the same will be more readily understood by reference to the following examples, which are provided by way of illustration and are not intended as limiting.

EXAMPLES

Example 1 Expression in E. coli

The DNA sequence encoding the CD33-like protein in the deposited polynucleotide is amplified using PCR oligonucleotide primers specific to the amino acid carboxyl terminal sequence of the CD33-like protein and to vector sequences 3′ to the gene. Additional nucleotides containing restriction sites to facilitate cloning are added to the 5′ and 3′ sequences respectively.

The 5′ oligonucleotide primer has the sequence: 5′CGCCCATGGAGAAGCCAGTGTACGAG 3′ (SEQ ID NO:4) containing the underlined NcoI restriction site, which encodes a start AUG within the NcoI site, followed by 17 nucleotides of the CD33-like protein cDNA having the nucleotide sequence shown at nucleotide position 85-102 in FIGS. 1A-1C (SEQ ID NO:1).

The 3′ primer has the sequence: 5′CGCAAGCTTTCAAGAGCCGCTCTGGGACCC 3′ (SEQ ID NO:5) containing the underlined HindIII restriction site and 18 nucleotides of the CD33-like protein cDNA having a nucleotide sequence complementary to the nucleotide sequence shown at nucleotide position 1285-1302 in FIGS. 1A-1C (SEQ ID NO:1).
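Because the underlining that marked the restriction sites does not survive in plain text, the layout of the two primers can be checked with a short sketch such as the following. The helper name `split_primer` is hypothetical; the primer sequences are those given above (SEQ ID NO:4 and SEQ ID NO:5), and the recognition sequences CCATGG (NcoI) and AAGCTT (HindIII) are the standard ones for those enzymes.

```python
# Minimal sketch: locate the restriction site in each Example 1 primer and
# split the primer into 5' clamp, site, and the remaining (3') portion.
SITES = {"NcoI": "CCATGG", "HindIII": "AAGCTT"}


def split_primer(primer: str, site: str) -> tuple[str, str, str]:
    """Return (5' clamp, recognition site, remainder) for the first site hit."""
    idx = primer.find(site)
    if idx < 0:
        raise ValueError(f"site {site} not found in primer")
    return primer[:idx], site, primer[idx + len(site):]


fwd = "CGCCCATGGAGAAGCCAGTGTACGAG"      # SEQ ID NO:4
rev = "CGCAAGCTTTCAAGAGCCGCTCTGGGACCC"  # SEQ ID NO:5

clamp, site, gene_part = split_primer(fwd, SITES["NcoI"])
print(clamp, site, gene_part, len(gene_part))  # CGC CCATGG AGAAGCCAGTGTACGAG 17

clamp, site, rest = split_primer(rev, SITES["HindIII"])
# The first three bases after the site, TCA, are the reverse complement of a
# TGA stop codon; the remaining 18 bases are the gene-specific portion.
print(clamp, site, rest[:3], rest[3:], len(rest[3:]))
```

The counts agree with the text: 17 gene-specific nucleotides in the 5′ primer and 18 in the 3′ primer.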

The restriction sites are convenient to restriction enzyme sites in the bacterial expression vector pQE60, which is used for bacterial expression in these examples. (Qiagen, Inc. 9259 Eton Avenue, Chatsworth, Calif., 91311). pQE60 encodes ampicillin antibiotic resistance (“Ampr”) and contains a bacterial origin of replication (“ori”), an IPTG inducible promoter, a ribosome binding site (“RBS”), a 6-His tag and restriction enzyme sites.

The amplified DNA and the vector pQE60 both are digested with NcoI and HindIII and then ligated together. Inserting amplified cDNA encoding the CD33-like protein into pQE60 places the coding region downstream of and operably linked to the vector's IPTG-inducible promoter, and in-frame with an initiating AUG appropriately positioned for translation of the CD33-like protein.

The ligation mixture is transformed into competent E. coli cells using standard procedures. Such procedures are described in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed.; Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989). E. coli strain M15/rep4, containing multiple copies of the plasmid pREP4, which expresses lac repressor and confers kanamycin resistance (“Kanr”), is used in carrying out the illustrative example described here. This strain, which is only one of many that are suitable for expressing the CD33-like protein, is available commercially from Qiagen.

Transformants are identified by their ability to grow on LB plates in the presence of ampicillin. Plasmid DNA is isolated from resistant colonies and the identity of the cloned DNA is confirmed by restriction analysis.

Clones containing the desired constructs are grown overnight (“O/N”) in liquid culture in LB media supplemented with both ampicillin (100 μg/ml) and kanamycin (25 μg/ml).

The O/N culture is used to inoculate a large culture, at a dilution of approximately 1:100 to 1:250. The cells are grown to an optical density at 600 nm (“OD600”) of between 0.4 and 0.6. Isopropyl-β-D-thiogalactopyranoside (“IPTG”) is then added to a final concentration of 1 mM to induce transcription from lac repressor sensitive promoters, by inactivating the lacI repressor. Cells subsequently are incubated further for 3 to 4 hours. Cells are then harvested by centrifugation and disrupted, by standard methods. Inclusion bodies are purified from the disrupted cells using routine collection techniques, and protein is solubilized from the inclusion bodies into 8M urea. The 8M urea solution containing the solubilized protein is passed over a PD-10 column in 2× phosphate buffered saline (“PBS”), thereby removing the urea, exchanging the buffer and refolding the protein. Expressed protein is purified by a further step of chromatography to remove endotoxin. Then, it is sterile filtered. The sterile filtered protein preparation is stored in 2× PBS at a concentration of 95 micrograms per ml.
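The inoculation and induction quantities follow directly from the stated dilution (about 1:100 to 1:250), OD600 window (0.4 to 0.6) and final IPTG concentration (1 mM). A minimal sketch of that arithmetic is given below; the function names, the 1000 ml culture volume and the 1 M (1000 mM) IPTG stock are assumptions for illustration and are not specified in the example.

```python
# Minimal sketch of the inoculation and induction arithmetic. The culture
# volume and IPTG stock concentration used below are assumed values.

def inoculum_volume_ml(culture_ml: float, dilution: float) -> float:
    """Volume of O/N culture for a 1:dilution inoculation (1:100 to 1:250)."""
    if not 100 <= dilution <= 250:
        raise ValueError("dilution outside the stated 1:100 to 1:250 range")
    return culture_ml / dilution


def ready_to_induce(od600: float) -> bool:
    """True when OD600 lies in the stated 0.4 to 0.6 induction window."""
    return 0.4 <= od600 <= 0.6


def iptg_stock_volume_ml(culture_ml: float, stock_mM: float, final_mM: float = 1.0) -> float:
    """Stock volume giving the final IPTG concentration (C1*V1 = C2*V2)."""
    return culture_ml * final_mM / stock_mM


culture = 1000.0  # ml, assumed
print(inoculum_volume_ml(culture, 100))         # 10.0
print(inoculum_volume_ml(culture, 250))         # 4.0
print(ready_to_induce(0.5))                     # True
print(iptg_stock_volume_ml(culture, 1000.0))    # 1.0
```

The IPTG calculation ignores the small volume added by the stock itself, which is negligible at a 1:1000 addition.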

Example 2 Expression in Mammalian Cells CHO, COS and Others

Most of the vectors used for the transient expression of the cDNA encoding the CD33-like protein in mammalian cells should carry the SV40 origin of replication. This allows the replication of the vector to high copy numbers in cells (e.g. COS cells) which express the T antigen required for the initiation of viral DNA synthesis. Any other mammalian cell line can also be utilized for this purpose.

A typical mammalian expression vector contains the promoter element, which mediates the initiation of transcription of mRNA, the protein coding sequence, and signals required for the termination of transcription and polyadenylation of the transcript. Additional elements include enhancers, Kozak sequences and intervening sequences flanked by donor and acceptor sites for RNA splicing. Highly efficient transcription can be achieved with the early and late promoters from SV40, the long terminal repeats (LTRs) from retroviruses, e.g. RSV, HTLVI, HIVI and the early promoter of the cytomegalovirus (CMV). However, cellular signals can also be used (e.g. human actin promoter). Suitable expression vectors for use in practicing the present invention include, for example, vectors such as pSVL and pMSG (Pharmacia, Uppsala, Sweden), pRSVcat (ATCC 37152), pSV2dhfr (ATCC 37146) and pBC12MI (ATCC 67109). Mammalian host cells that could be used include human HeLa, 293, H9 and Jurkat cells, mouse NIH3T3 and C127 cells, Cos 1, Cos 7 and CV1 African green monkey cells, quail QC1-3 cells, mouse L cells and Chinese hamster ovary cells.

Alternatively, the gene can be expressed in stable cell lines that contain the gene integrated into a chromosome. The co-transfection with a selectable marker such as dhfr, gpt, neomycin, or hygromycin allows the identification and isolation of the transfected cells.

The transfected gene can also be amplified to express large amounts of the encoded protein. The DHFR (dihydrofolate reductase) is a useful marker to develop cell lines that carry several hundred or even several thousand copies of the gene of interest. Another useful selection marker is the enzyme glutamine synthase (GS) (Murphy et al., Biochem J. 227:277-279 (1991); Bebbington et al., Bio/Technology 10:163-175 (1992)). Using these markers, the mammalian cells are grown in selective medium and the cells with the highest resistance are selected. These cell lines contain the amplified gene(s) integrated into a chromosome. Chinese hamster ovary (CHO) cells are often used for the production of proteins.

The expression vectors pC1 and pC4 contain the strong promoter (LTR) of the Rous Sarcoma Virus (Cullen et al., Molecular and Cellular Biology, 438-447 (March 1985)) plus a fragment of the CMV-enhancer (Boshart et al., Cell 41:521-530 (1985)). Multiple cloning sites, e.g. with the restriction enzyme cleavage sites BamHI, XbaI and Asp718, facilitate the cloning of the gene of interest. The vectors contain in addition the 3′ intron, the polyadenylation and termination signal of the rat preproinsulin gene.

Example 2A Expression of Extracellular Soluble Domain of CD33-Like Protein in COS Cells

The expression plasmid is made by cloning a cDNA encoding CD33-like protein into the expression vector pcDNAI/Amp (which can be obtained from Invitrogen, Inc.). The expression vector pcDNAI/Amp contains: (1) an E. coli origin of replication effective for propagation in E. coli and other prokaryotic cells; (2) an ampicillin resistance gene for selection of plasmid-containing prokaryotic cells; (3) an SV40 origin of replication for propagation in eukaryotic cells; (4) a CMV promoter, a polylinker, an SV40 intron, and a polyadenylation signal arranged so that a cDNA conveniently can be placed under expression control of the CMV promoter and operably linked to the SV40 intron and the polyadenylation signal by means of restriction sites in the polylinker.

A DNA fragment encoding the CD33-like protein extracellular soluble domain and a HA tag fused in frame to its 3′ end are cloned into the polylinker region of the vector so that recombinant protein expression is directed by the CMV promoter. The HA tag corresponds to an epitope derived from the influenza hemagglutinin protein described by Wilson et al., Cell 37: 767 (1984). The fusion of the HA tag to the target protein allows easy detection of the recombinant protein with an antibody that recognizes the HA epitope.

The plasmid construction strategy is as follows. The CD33-like protein cDNA of the deposit clone is amplified using primers that contain convenient restriction sites, much as described above for the expression of CD33-like protein in E. coli. To facilitate detection, purification and characterization of the expressed CD33-like protein, one of the primers contains a hemagglutinin tag (“HA tag”) as described above.

Suitable primers include the following, which are used in this example: the 5′ primer 5′CGC GGA TCC GCC ATC ATG CTG CCC CTG CTG CTG 3′ (SEQ ID NO:6) contains the underlined BamHI site and the first 18 nucleotides in the coding region of the CD33-like protein cDNA having the nucleotide sequence shown at nucleotide position 37-54 in FIGS. 1A-1C (SEQ ID NO:1).

The 3′ primer, containing the underlined XbaI site, stop codon, hemagglutinin tag and being complementary to the nucleotide sequence of the last 18 bp preceding the transmembrane domain (bps 1285-1302) shown in FIGS. 1A-1C (SEQ ID NO:1), has the following sequence: 5′CGC TCT AGA TCA AGC GTA GTC TGG GAC GTC GTA TGG GTA AGA GCC GCT CTG GGA CCC 3′ (SEQ ID NO:7).
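The arrangement of the 3′ primer (XbaI site, stop codon, HA tag, then 18 gene-specific nucleotides, all on the antisense strand) can be confirmed by reverse-complementing the portion after the XbaI site and translating it with the standard genetic code, as in the illustrative sketch below. The helper names are hypothetical; the primer is SEQ ID NO:7 as given above, TCTAGA is the XbaI recognition sequence, and YPYDVPDYA is the commonly cited HA epitope sequence.

```python
# Minimal sketch: read the coding sense of the Example 2A 3' primer.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order for each codon position.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence (5' to 3')."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


def translate(dna: str) -> str:
    """Translate complete codons of a DNA sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# SEQ ID NO:7, written in triplets as in the text, spaces removed.
primer = ("CGC TCT AGA TCA AGC GTA GTC TGG GAC GTC GTA TGG GTA "
          "AGA GCC GCT CTG GGA CCC").replace(" ", "")

after_site = primer[primer.index("TCTAGA") + 6:]  # drop clamp and XbaI site
sense = reverse_complement(after_site)
print(translate(sense))  # GSQSGSYPYDVPDYA*
```

Read in the sense direction, the primer encodes the last six residues of the extracellular domain, followed in frame by the HA epitope YPYDVPDYA and a stop codon.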

The PCR amplified DNA fragment and the vector, pcDNAI/Amp, are digested with BamHI and XbaI and then ligated. The ligation mixture is transformed into E. coli strain SURE (available from Stratagene Cloning Systems, 11099 North Torrey Pines Road, La Jolla, Calif. 92037), and the transformed culture is plated on ampicillin media plates which then are incubated to allow growth of ampicillin resistant colonies. Plasmid DNA is isolated from resistant colonies and examined by restriction analysis and gel sizing for the presence of the CD33-like protein-encoding fragment.

For expression of recombinant CD33-like protein, COS cells are transfected with an expression vector, as described above, using DEAE-dextran, as described, for instance, in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989). Cells are incubated under conditions for expression of CD33-like protein by the vector. Expression of the CD33-like protein/HA fusion protein is detected by radiolabelling and immunoprecipitation, using methods described in, for example, Harlow et al., Antibodies: A Laboratory Manual, 2nd Ed.; Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1988). To this end, two days after transfection, the cells are labeled by incubation in media containing ³⁵S-cysteine for 8 hours. The cells and the media are collected, and the cells are washed and then lysed with detergent-containing RIPA buffer: 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM TRIS, pH 7.5, as described by Wilson et al. cited above. Proteins are precipitated from the cell lysate and from the culture media using an HA-specific monoclonal antibody. The precipitated proteins then are analyzed by SDS-PAGE gels and autoradiography. An expression product of the expected size is seen in the cell lysate, which is not seen in negative controls.

Example 2B Expression and Purification of Human CD33-Like Protein Using the CHO Expression System

The DNA sequence encoding CD33-like protein in the deposited polynucleotide is amplified using PCR oligonucleotide primers specific to the carboxyl terminal sequence of the CD33-like protein and to vector sequences 3′ to the gene. Additional nucleotides containing restriction sites to facilitate cloning are added to the 5′ and 3′ sequences respectively.

For both the full length gene and the nucleotide sequence encoding the extracellular soluble domain, the 5′ primer has the sequence: 5′CGC GGA TCC GCC ATC ATG CTG CCC CTG CTG CTG 3′ (SEQ ID NO:9) containing the underlined BamHI restriction site and the first 18 nucleotides in the coding region of the CD33-like protein cDNA having the nucleotide sequence shown at nucleotide position 37-54 in FIGS. 1A-1C (SEQ ID NO:1). For the full length gene, the 3′ primer has the sequence: 5′CGC GGT ACC TCA GTG GCT CCT CCA GCC AGG 3′ (SEQ ID NO:8), containing the underlined Asp718 restriction site and nucleotides of the CD33-like protein cDNA having a sequence complementary to the nucleotide sequence shown at nucleotide position 1713-1730 in FIGS. 1A-1C (SEQ ID NO:1). For the extracellular domain, the 3′ primer has the sequence: 5′CGC GGT ACC TCA AGA GCC GCT CTG GGA CCC 3′ (SEQ ID NO:10), containing the underlined Asp718 restriction site and 18 nucleotides of the CD33-like protein cDNA having a nucleotide sequence complementary to the nucleotide sequence shown at nucleotide position 1285-1302 in FIGS. 1A-1C (SEQ ID NO:1). The restriction sites are convenient to restriction enzyme sites in the CHO expression vector pC4.

The amplified CD33-like protein DNA and the vector pC4 both are digested with BamHI and the digested DNAs then ligated together. Insertion of the CD33-like protein DNA into the BamHI restricted vector places the CD33-like protein coding region downstream of and operably linked to the vector's promoter. The ligation mixture is transformed into competent E. coli cells using standard procedures. Such procedures are described in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed.; Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989).

Example 3 Cloning and Expression of the CD33-Like Protein in a Baculovirus Expression System

The cDNA sequence encoding either the soluble extracellular domain or the full length CD33-like protein receptor in the deposited clone is amplified using PCR oligonucleotide primers corresponding to the 5′ and 3′ sequences of the gene:

The 5′ primer for expression of either the extracellular domain or the full length protein has the sequence: 5′CGC GGA TCC GCC ATC ATG CTG CCC CTG CTG CTG 3′ (SEQ ID NO:9) containing the underlined BamHI restriction site and the first 18 nucleotides in the coding region of the CD33-like protein cDNA having the nucleotide sequence shown at nucleotide position 37-54 in FIGS. 1A-1C (SEQ ID NO:1). Inserted into an expression vector, as described below, the 5′ end of the amplified fragment encoding CD33-like protein provides an efficient signal peptide. An efficient signal for initiation of translation in eukaryotic cells, as described by Kozak, M., J. Mol. Biol. 196:947-950 (1987) is appropriately located in the vector portion of the construct.

For the full length gene, the 3′ primer has the sequence: 5′CGC GGT ACC TCA GTG GCT CCT CCA GCC AGG 3′ (SEQ ID NO:8), containing the underlined Asp718 restriction site and nucleotides of the CD33-like protein cDNA having a sequence complementary to the nucleotide sequence shown at nucleotide position 1713-1730 in FIGS. 1A-1C (SEQ ID NO:1). For the extracellular domain, the 3′ primer has the sequence: 5′CGC GGT ACC TCA AGA GCC GCT CTG GGA CCC 3′ (SEQ ID NO:10), containing the underlined Asp718 restriction site and 18 nucleotides of the CD33-like protein cDNA having a nucleotide sequence complementary to the nucleotide sequence shown at nucleotide position 1285-1302 in FIGS. 1A-1C (SEQ ID NO:1).

The amplified fragment is isolated from a 1% agarose gel using a commercially available kit (“Geneclean®,” BIO 101 Inc., La Jolla, Calif.). The fragment then is digested with BamHI and Asp718 and again is purified on a 1% agarose gel. This fragment is designated herein F2.

The vector pA2 is used to express the CD33-like protein in the baculovirus expression system, using standard methods, such as those described in Summers et al., A Manual of Methods for Baculovirus Vectors and Insect Cell Culture Procedures, Texas Agricultural Experimental Station Bulletin No. 1555 (1987). This expression vector contains the strong polyhedrin promoter of the Autographa californica nuclear polyhedrosis virus (AcMNPV) followed by convenient restriction sites. For an easy selection of recombinant virus the beta-galactosidase gene from E. coli is inserted in the same orientation as the polyhedrin promoter and is followed by the polyadenylation signal of the polyhedrin gene. The polyhedrin sequences are flanked at both sides by viral sequences for cell-mediated homologous recombination with wild-type viral DNA to generate viable virus that express the cloned polynucleotide.

Many other baculovirus vectors could be used in place of pA2, such as pAc373, pVL941 and pAcIM1 provided, as those of skill readily will appreciate, that construction provides appropriately located signals for transcription, translation, trafficking and the like, such as an in-frame AUG and a signal peptide, as required. Such vectors are described in Luckow et al., Virology 170:31-39, among others.

The plasmid is digested with the restriction enzymes BamHI and Asp718 and then is dephosphorylated using calf intestinal phosphatase, using routine procedures known in the art. The DNA is then isolated from a 1% agarose gel using a commercially available kit (“Geneclean®,” BIO 101 Inc., La Jolla, Calif.). This vector DNA is designated herein “V2”.

Fragment F2 and the dephosphorylated plasmid V2 are ligated together with T4 DNA ligase. E. coli HB101 cells are transformed with ligation mix and spread on culture plates. Bacteria are identified that contain the plasmid with the human CD33-like protein gene by digesting DNA from individual colonies using BamHI and Asp718 and then analyzing the digestion product by gel electrophoresis. The sequence of the cloned fragment is confirmed by DNA sequencing. This plasmid is designated herein pBac/CD33-like protein.

Five μg of the plasmid pBac/CD33-like protein is co-transfected with 1.0 μg of a commercially available linearized baculovirus DNA (“BaculoGold® baculovirus DNA”, Pharmingen, San Diego, Calif.), using the lipofection method described by Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987). One μg of BaculoGold® virus DNA and 5 μg of the plasmid pBac/CD33-like protein are mixed in a sterile well of a microtiter plate containing 50 μl of serum-free Grace's medium (Life Technologies Inc., Gaithersburg, Md.). Afterwards, 10 μl Lipofectin plus 90 μl Grace's medium are added, mixed and incubated for 15 minutes at room temperature. Then the transfection mixture is added drop-wise to Sf9 insect cells (ATCC CRL 1711) seeded in a 35 mm tissue culture plate with 1 ml Grace's medium without serum. The plate is rocked back and forth to mix the newly added solution. The plate is then incubated for 5 hours at 27° C. After 5 hours the transfection solution is removed from the plate and 1 ml of Grace's insect medium supplemented with 10% fetal calf serum is added. The plate is put back into an incubator and cultivation is continued at 27° C. for four days.

After four days the supernatant is collected and a plaque assay is performed, as described by Summers and Smith, cited above. An agarose gel with “Blue Gal” (Life Technologies Inc., Gaithersburg, Md.) is used to allow easy identification and isolation of gal-expressing clones, which produce blue-stained plaques. (A detailed description of a “plaque assay” of this type can also be found in the user's guide for insect cell culture and baculovirology distributed by Life Technologies Inc., Gaithersburg, Md., pages 9-10).

Four days after serial dilution, the virus is added to the cells. After appropriate incubation, blue stained plaques are picked with the tip of an Eppendorf pipette. The agar containing the recombinant viruses is then resuspended in an Eppendorf tube containing 200 μl of Grace's medium. The agar is removed by a brief centrifugation and the supernatant containing the recombinant baculovirus is used to infect Sf9 cells seeded in 35 mm dishes. Four days later the supernatants of these culture dishes are harvested and then they are stored at 4° C. A clone containing properly inserted CD33-like protein receptor is identified by DNA analysis including restriction mapping and sequencing. This is designated herein as V-CD33-like protein.

Sf9 cells are grown in Grace's medium supplemented with 10% heat-inactivated FBS. The cells are infected with the recombinant baculovirus V-CD33-like protein at a multiplicity of infection (“MOI”) of about 2 (about 1 to about 3). Six hours later the medium is removed and is replaced with SF900 II medium minus methionine and cysteine (available from Life Technologies Inc., Gaithersburg, Md.). Forty-two hours later, 5 μCi of ³⁵S-methionine and 5 μCi of ³⁵S-cysteine (available from Amersham) are added. The cells are further incubated for 16 hours and then they are harvested by centrifugation, lysed and the labeled proteins are visualized by SDS-PAGE and autoradiography.
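As an illustration of what an MOI of about 2 means in practice, the following minimal sketch computes the volume of virus stock needed for a given number of cells. It relies only on the standard definition of MOI as plaque-forming units (pfu) added per cell. The cell count and stock titer shown are hypothetical values chosen for the example and are not taken from this description.

```python
# Illustrative sketch only: virus stock volume needed to reach a target MOI.
# The cell count and titer used below are hypothetical example values.


def inoculum_volume_ml(moi, cell_count, titer_pfu_per_ml):
    """Return the stock volume (ml) that delivers `moi` pfu per cell."""
    if titer_pfu_per_ml <= 0:
        raise ValueError("titer must be positive")
    if moi < 0 or cell_count < 0:
        raise ValueError("moi and cell_count must be non-negative")
    return moi * cell_count / titer_pfu_per_ml


if __name__ == "__main__":
    cells = 1_000_000   # hypothetical number of Sf9 cells in the dish
    titer = 1e8         # hypothetical stock titer, pfu per ml
    for moi in (1, 2, 3):  # the "about 1 to about 3" range given above
        volume_ul = inoculum_volume_ml(moi, cells, titer) * 1000
        print(f"MOI {moi}: {volume_ul:.1f} microliters of stock")
```

The titer itself would normally come from a plaque assay of the kind described above.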

Example 4 Tissue Distribution of CD33-Like Protein mRNA Expression

Northern blot analysis was carried out to examine the levels of expression of the gene encoding the CD33-like protein in human tissues, using methods described by, among others, Sambrook et al., cited above. A cDNA probe containing the entire nucleotide sequence of the CD33-like protein of the present invention (SEQ ID NO:1) was labeled with ³²P using the rediprime DNA labeling system (Amersham Life Science), according to manufacturer's instructions. After labeling, the probe was purified using a CHROMA SPIN-100 column (Clontech Laboratories, Inc.), according to manufacturer's protocol number PT1200-1. The purified labeled probe was then used to examine various human tissues for the expression of the gene encoding the CD33-like protein.

Multiple Tissue Northern (MTN) blots containing various human tissues (H) or human immune system tissues (IM) were obtained from Clontech and were examined with labeled probe using ExpressHyb Hybridization Solution (Clontech) according to manufacturer's protocol number PT1190-1. Following hybridization and washing, the blots were mounted and exposed to film at −70° C. overnight, and films developed according to standard procedures.

As shown in FIG. 3, expression of the gene encoding the CD33-like protein of the present invention was highest in hemopoietic tissues including bone marrow, peripheral blood leucocytes and spleen (FIG. 3, lanes 10, 11 and 15, respectively), but was also detected in other tissues such as the lymph node, appendix, lung and placenta (FIG. 3, lanes 14, 12, 5 and 6, respectively).

It will be clear that the invention may be practiced otherwise than as particularly described in the foregoing description and examples.

Numerous modifications and variations of the present invention are possible in light of the above teachings and, therefore, are within the scope of the appended claims.

The disclosures of all patents, patent applications, and publications referred to herein are hereby incorporated by reference.

1. An isolated nucleic acid molecule comprising a polynucleotide having a nucleotide sequence at least 95% identical to a sequence selected from the group consisting of: (a) a nucleotide sequence encoding a polypeptide comprising amino acids from about −15 to about 536 in SEQ ID NO:2; (b) a nucleotide sequence encoding a polypeptide comprising amino acids from about −14 to about 536 in SEQ ID NO:2; (c) a nucleotide sequence encoding a polypeptide comprising amino acids from about 1 to about 536 in SEQ ID NO:2; (d) a nucleotide sequence encoding a polypeptide having the amino acid sequence encoded by the cDNA clone contained in ATCC Deposit No. 97521; (e) a nucleotide sequence encoding the mature CD33-like polypeptide having the amino acid sequence encoded by the cDNA clone contained in ATCC Deposit No. 97521; (f) a nucleotide sequence encoding the CD33-like polypeptide extracellular domain; (g) a nucleotide sequence encoding the CD33-like polypeptide transmembrane domain; (h) a nucleotide sequence encoding the CD33-like polypeptide intracellular domain; (i) a nucleotide sequence encoding the CD33-like polypeptide receptor extracellular and intracellular domains with all or part of the transmembrane domain deleted; and (j) a nucleotide sequence complementary to any of the nucleotide sequences in (a), (b), (c), (d), (e), (f), (g), (h), or (i).
2. An isolated nucleic acid molecule comprising a polynucleotide which hybridizes under stringent hybridization conditions to a polynucleotide having a nucleotide sequence identical to a nucleotide sequence in (a), (b), (c), (d), (e), (f), (g), (h), (i), or (j) of claim 1 wherein said polynucleotide which hybridizes does not hybridize under stringent hybridization conditions to a polynucleotide having a nucleotide sequence consisting of only A residues or of only T residues.
3. An isolated nucleic acid fragment of the polynucleotide of claim 1, wherein said fragment is at least 15 nucleotides in length.
4. A method for making a recombinant vector comprising inserting an isolated nucleic acid molecule of claim 1 into a vector.
5. A recombinant vector produced by the method of claim 4.
6. A method of making a recombinant host cell comprising introducing the recombinant vector of claim 5 into a host cell.
7. A recombinant host cell produced by the method of claim 6.
8. A recombinant method for producing a CD33-like polypeptide, comprising culturing the recombinant host cell of claim 7 under conditions such that said polypeptide is expressed and recovering said polypeptide.
9. An isolated CD33-like polypeptide having an amino acid sequence at least 95% identical to a sequence selected from the group consisting of: (a) amino acids from about −15 to about 536 in SEQ ID NO:2; (b) amino acids from about −14 to about 536 in SEQ ID NO:2; (c) amino acids from about 1 to about 536 in SEQ ID NO:2; (d) the amino acid sequence of the CD33-like polypeptide having the amino acid sequence encoded by the cDNA clone contained in ATCC Deposit No. 97521; (e) the amino acid sequence of the mature CD33-like polypeptide having the amino acid sequence encoded by the cDNA clone contained in ATCC Deposit No. 97521; (f) the amino acid sequence of the CD33-like polypeptide extracellular domain; (g) the amino acid sequence of the CD33-like polypeptide transmembrane domain; (h) the amino acid sequence of the CD33-like polypeptide intracellular domain; (i) the amino acid sequence of the CD33-like polypeptide extracellular and intracellular domains with all or part of the transmembrane domain deleted; and (j) the amino acid sequence of an epitope-bearing portion of any one of the polypeptides of (a), (b), (c), (d), (e), (f), (g), (h), or (i).
10. An isolated antibody that binds specifically to a CD33-like polypeptide of claim 9.
11. An isolated nucleic acid molecule comprising a polynucleotide encoding a CD33-like polypeptide wherein, except for at least one conservative amino acid substitution, said polypeptide has a sequence selected from the group consisting of: (a) a nucleotide sequence encoding a polypeptide comprising amino acids from about −15 to about 536 in SEQ ID NO:2; (b) a nucleotide sequence encoding a polypeptide comprising amino acids from about −14 to about 536 in SEQ ID NO:2; (c) a nucleotide sequence encoding a polypeptide comprising amino acids from about 1 to about 536 in SEQ ID NO:2; (d) a nucleotide sequence encoding a polypeptide having the amino acid sequence encoded by the cDNA clone contained in ATCC Deposit No. 97521; (e) a nucleotide sequence encoding the mature CD33-like polypeptide having the amino acid sequence encoded by the cDNA clone contained in ATCC Deposit No. 97521; (f) a nucleotide sequence encoding the CD33-like polypeptide extracellular domain; (g) a nucleotide sequence encoding the CD33-like polypeptide transmembrane domain; (h) a nucleotide sequence encoding the CD33-like polypeptide intracellular domain; (i) a nucleotide sequence encoding the CD33-like polypeptide receptor extracellular and intracellular domains with all or part of the transmembrane domain deleted; and (j) a nucleotide sequence complementary to any of the nucleotide sequences in (a), (b), (c), (d), (e), (f), (g), (h), or (i).
12. An isolated CD33-like polypeptide wherein, except for at least one conservative amino acid substitution, said polypeptide has a sequence selected from the group consisting of: (a) amino acids from about −15 to about 536 in SEQ ID NO:2; (b) amino acids from about −14 to about 536 in SEQ ID NO:2; (c) amino acids from about 1 to about 536 in SEQ ID NO:2; (d) the amino acid sequence of the CD33-like polypeptide having the amino acid sequence encoded by the cDNA clone contained in ATCC Deposit No. 97521; (e) the amino acid sequence of the mature CD33-like polypeptide having the amino acid sequence encoded by the cDNA clone contained in ATCC Deposit No. 97521; (f) the amino acid sequence of the CD33-like polypeptide extracellular domain; (g) the amino acid sequence of the CD33-like polypeptide transmembrane domain; (h) the amino acid sequence of the CD33-like polypeptide intracellular domain; (i) the amino acid sequence of the CD33-like polypeptide extracellular and intracellular domains with all or part of the transmembrane domain deleted; and (j) the amino acid sequence of an epitope-bearing portion of any one of the polypeptides of (a), (b), (c), (d), (e), (f), (g), (h), or (i).
13. An isolated nucleic acid molecule comprising a polynucleotide having a sequence at least 95% identical to a sequence selected from the group consisting of: (a) the nucleotide sequence of clone HTOBA14R (SEQ ID NO:11); and (b) the nucleotide sequence of a portion of the sequence shown in SEQ ID NO:1 wherein said portion comprises at least 50 contiguous nucleotides from nucleotide 1 to nucleotide 1898; and (c) a nucleotide sequence complementary to any of the nucleotide sequences in (a) or (b) above.